IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

In re Application of: 

DAHLQUIST et al.

Serial No. Not yet assigned 

Filed: With Application 

For: A NEW CLASS OF ENZYMES IN THE BIOSYNTHETIC PATHWAY FOR THE
PRODUCTION OF TRIACYLGLYCEROL AND RECOMBINANT DNA 
MOLECULES ENCODING THESE ENZYMES 

PRELIMINARY AMENDMENT 

Hon. Commissioner of Patents and Trademarks 
Washington, D.C. 20231

Sir: 

Prior to examination, kindly amend the above-identified application as follows: 
IN THE SPECIFICATION 

Page 1, after the title, insert 

--This is a continuation of provisional application Serial No. 60/180,687, filed February
7, 2000.--
IN THE CLAIMS 

Claim 3, line 1, delete "claims 1 or 2" and insert --claim 1--.

Claim 4, line 1, delete "claims 1 to 3" and insert --claim 1--.

Claim 5, line 1, delete "claims 1 to 4" and insert --claim 1--.

Claim 6, line 1, delete "claims 1 to 5" and insert --claim 1--.

Claim 9, line 1, delete "claims 7 or 8" and insert --claim 7--.

Claim 10, line 2, delete "claims 7 to 9" and insert --claim 7--.

Claim 11, line 1, delete "claims 7 to 10" and insert --claim 7--.

Claim 12, lines 1 and 2, delete "claims 7 to 11" and insert --claim 7--.

Claim 13, lines 1 and 2, delete "claims 7 to 11 or a gene construct according to claim
12" and insert --claim 7--.

Claim 15, line 1, delete "claims 13 or 14" and insert --claim 13--.

Claim 16, line 2, delete "claims 7 to 11" and insert --claim 7--;
line 3, delete "to 15" and insert --28--.

Claim 18, line 1, delete "claims 16 or 17" and insert --claim 16--.

Claim 19, line 1, delete "claims 16 to 18" and insert --claim 16--.

Claim 20, line 1, delete "claims 16 to 19" and insert --claim 19--.

Claim 21, line 1, delete "claims 16 to 20" and insert --claim 16--.

Claim 22, line 1, delete "claims 16 to 21" and insert --claim 16--.

Claim 23, line 1, delete "claims 16 to 22" and insert --claim 16--.

Claim 25, line 1, change "24" to --30--.

Claim 15, line 1, delete "24" and insert --20--.

Cancel claims 24, 26 and 27.

Insert the following new claims. 

--28. A vector comprising the gene construct of claim 12.

29. A vector according to claim 28, further comprising a selectable marker gene and/or
nucleotide sequences for the replication in a host cell or the integration into the genome of the
host cell.

30. A process for the production of triacylglycerol, comprising growing a transgenic cell
or organism according to claim 16 under conditions whereby a nucleotide sequence encoding
an enzyme catalyzing in an acyl-CoA-independent reaction the transfer of fatty acids from
phospholipids to diacylglycerol in the biosynthetic pathway for the production of triacylglycerol
is expressed and whereby said transgenic cells comprising an enzyme catalyzing in an
acyl-CoA-independent reaction the transfer of fatty acids from phospholipids to diacylglycerol in the
biosynthetic pathway for the production of triacylglycerol.

31. A method of producing triacylglycerol and/or triacylglycerols with uncommon fatty
acids which comprises transforming an organism or host cell using the nucleotide sequence of
claim 7, whereby the transformation results in an altered, preferably increased, oil content of
the cell or organism.

32. A method of producing triacylglycerol and/or triacylglycerols with uncommon fatty
acids using the nucleotide sequence of claim 7.

33. A method of producing triacylglycerol and/or triacylglycerols with uncommon fatty
acids using the enzyme of claim 1.--



REMARKS

The claims have been amended to put the application in better form for U.S. filing. No
new matter has been added.

Entry of the above amendment is respectfully solicited.


Respectfully submitted, 



KEIL & WEINKAUF

Herbert B. Keil
Reg. No. 18,967

1101 Connecticut Ave., N.W.
Washington, D.C. 20036
(202) 659-0100






NAE 3377/99 16.03.00

A NEW CLASS OF ENZYMES IN THE BIOSYNTHETIC PATHWAY FOR THE 
PRODUCTION OF TRIACYLGLYCEROL AND RECOMBINANT DNA 
MOLECULES ENCODING THESE ENZYMES 


The present invention relates to the isolation, identification and characterization
of recombinant DNA molecules encoding enzymes catalysing the transfer of
fatty acids from phospholipids to diacylglycerol in the biosynthetic pathway for
the production of triacylglycerol.

Triacylglycerol (TAG) is the most common lipid-based energy reserve in nature.
The main pathway for synthesis of TAG is believed to involve three sequential
acyl transfers from acyl-CoA to a glycerol backbone (1, 2). For many years,
acyl-CoA:diacylglycerol acyltransferase (DAGAT), which catalyzes the third
acyl transfer reaction, was thought to be the only unique enzyme involved in
TAG synthesis. It acts by diverting diacylglycerol (DAG) from membrane lipid
synthesis into TAG (2). Genes encoding this enzyme were recently identified
both in the mouse (3) and in plants (4, 5), and the encoded proteins were
shown to be homologous to acyl-CoA:cholesterol acyltransferase (ACAT). It
was also recently reported that another DAGAT exists in the oleaginous fungus
Mortierella ramanniana, which is unrelated to the mouse DAGAT, the ACAT
gene family or to any other known gene (6).

The instant invention relates to a novel type of enzymes and their encoding
genes for transformation. More specifically, the invention relates to the use of a
type of genes encoding a not previously described type of enzymes, hereinafter
designated phospholipid:diacylglycerol acyltransferases (PDAT), whereby this
enzyme catalyses an acyl-CoA-independent reaction. Genes of this type,
expressed alone in transgenic organisms, will enhance the total amount of oil
(triacylglycerols) produced in the cells. The PDAT genes, in combination with a
gene for the synthesis of an uncommon fatty acid, will, when expressed in
transgenic organisms, enhance the levels of the uncommon fatty acids in the
triacylglycerols.

There is considerable interest world-wide in producing chemical feedstock,
such as fatty acids, for industrial use from renewable plant resources rather
than non-renewable petrochemicals. This concept has broad appeal to
manufacturers and consumers on the basis of resource conservation and
provides significant opportunity to develop new industrial crops for agriculture.

There is a diverse array of unusual fatty acids in oils from wild plant species
and these have been well characterised. Many of these acids have industrial
potential and this has led to interest in domesticating relevant plant species to
enable agricultural production of particular fatty acids.

Developments in genetic engineering technologies combined with greater
understanding of the biosynthesis of unusual fatty acids now make it possible
to transfer genes coding for key enzymes involved in the synthesis of a
particular fatty acid from a wild species into domesticated oilseed crops. In this
way individual fatty acids can be produced in high purity and quantities at
moderate costs.

In all oil crops, such as rape, sunflower and oil palm, the oil (i.e. triacylglycerols) is the
most valuable product of the seeds or fruits, and other compounds like starch,
protein and fibre are regarded as by-products of less value. Enhancing the
quantity of oil per weight basis at the expense of other compounds in oil crops
would therefore increase the value of the crop. If genes regulating the allocation of
reduced carbon into the production of oil can be up-regulated, the cells will
accumulate more oil at the expense of other products. Such genes might not
only be used in already high oil producing cells such as oil crops but could also
induce significant oil production in moderate or low oil containing crops such as
soy, oat, maize, potato, sugar beet and turnip, as well as in
micro-organisms.

Summary of the invention

Many of the unusual fatty acids of interest, e.g. medium chain fatty acids,
hydroxy fatty acids, epoxy fatty acids and acetylenic fatty acids, have physical
properties that are distinctly different from the common plant fatty acids. The
present inventors have found that, in plant species naturally accumulating
these uncommon fatty acids in their seed oil (i.e. triacylglycerol), these acids
are absent, or present in very low amounts, in the membrane (phospho)lipids of
the seed. The low concentration of these acids in the membrane lipids is most
likely a prerequisite for proper membrane function and thereby for proper cell
functions. One aspect of the invention is that seeds of transgenic crops can be
made to accumulate high amounts of uncommon fatty acids if these fatty acids
are efficiently removed from the membrane lipids and channelled into seed
triacylglycerols.

The inventors have identified a novel class of enzymes in plants catalysing the
transfer of fatty acids from phospholipids to diacylglycerol in the production of
triacylglycerol through an acyl-CoA-independent reaction, and have found that these
enzymes (phospholipid:diacylglycerol acyltransferases, abbreviated as PDAT)
are involved in the removal of hydroxylated and epoxygenated fatty acids, and
probably also other uncommon fatty acids such as medium chain fatty acids,
from phospholipids in plants.

This enzyme reaction was shown to be present in microsomal preparations
from baker's yeast (Saccharomyces cerevisiae). The instant invention further
pertains to an enzyme comprising an amino acid sequence as set forth in SEQ
ID NO. 2 or a functional fragment, derivative, allele, homologue or isoenzyme
thereof. A so-called 'knock out' yeast mutant, disrupted in the respective gene,
was obtained, and microsomal membranes from the mutant were shown to
totally lack PDAT activity. Thus, it was proved that the disrupted gene encodes
a PDAT enzyme (SEQ ID NO. 1 and 2).

The instant invention pertains further to an enzyme comprising an amino acid
sequence as set forth in SEQ ID NO. 1a, 2b or 5a or a functional fragment,
derivative, allele, homologue or isoenzyme thereof.

Further genes and/or proteins of so far unknown function were identified and
are contemplated within the scope of the instant invention. A gene from
Schizosaccharomyces pombe, SPBC776.14 (SEQ ID NO. 3), and a putative open
reading frame CAA22887 of SPBC776.14 (SEQ ID NO. 13) were identified.
Further, Arabidopsis thaliana genomic sequences (SEQ ID NO. 4, 10 and 11)
coding for putative proteins were identified, as well as a putative open reading
frame AAC80628 from the A. thaliana locus AC 004557 (SEQ ID NO. 14) and a
putative open reading frame AAD10668 from the A. thaliana locus AC 003027
(SEQ ID NO. 15).

Also, a partially sequenced cDNA clone from Neurospora crassa (SEQ ID NO.
9) and a Zea mays EST (Expressed Sequence Tag) clone (SEQ ID NO. 7) and
corresponding putative amino acid sequence (SEQ ID NO. 8) were identified.
Finally, two cDNA clones were identified: one Arabidopsis thaliana EST (SEQ
ID NO. 5 and corresponding predicted amino acid sequence SEQ ID NO. 6)
and a Lycopersicon esculentum EST clone (SEQ ID NO. 12).
Further, enzymes designated as PDAT comprising an amino acid sequence
selected from the group consisting of sequences as set forth in SEQ ID NO. 2a,
3a, 5b, 6 or 7b are contemplated within the scope of the invention. Moreover,
an enzyme comprising an amino acid sequence encoded by a nucleotide
sequence, or a portion, derivative, allele or homologue thereof, selected from the
group consisting of sequences as set forth in SEQ ID NO. 1, 1b, 3, 3b, 4, 4a,
4b, 5, 5b, 6b, 7, 8b, 9, 9b, 10, 10b, 11, 11b or 12, or a functional fragment,
derivative, allele, homologue or isoenzyme of the enzyme-encoding amino acid
sequence, is included within the scope of the invention.

A functional fragment of the instant enzyme is understood to be any polypeptide
sequence which shows specific enzyme activity of a phospholipid:diacylglycerol
acyltransferase (PDAT). The length of the functional fragment can for example
vary in a range from about 660 ± 10 amino acids to 660 ± 250 amino acids,
preferably from about 660 ± 50 to 660 ± 100 amino acids, whereby the "basic
number" of 660 amino acids corresponds in this case to the polypeptide chain
of the PDAT enzyme of SEQ ID NO. 2 encoded by a nucleotide sequence
according to SEQ ID NO. 1. Consequently, the "basic number" of a functional
full-length enzyme can vary in correspondence to the encoding nucleotide
sequence.

A portion of the instant nucleotide sequence is meant to be any nucleotide
sequence encoding a polypeptide which shows specific activity of a
phospholipid:diacylglycerol acyltransferase (PDAT). The length of the
nucleotide portion can vary in a wide range of several hundreds of
nucleotides based upon the coding region of the gene or a highly conserved
sequence. For example, the length varies in a range from about 1900 ± 10 to
1900 ± 1000 nucleotides, preferably from about 1900 ± 50 to 1900 ± 700 and
more preferably from about 1900 ± 100 to 1900 ± 500 nucleotides, whereby the
"basic number" of 1900 nucleotides corresponds in this case to the encoding
nucleotide sequence of the PDAT enzyme of SEQ ID NO. 1. Consequently, the
"basic number" of a functional full-length gene can vary.
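
The length windows stated above for functional fragments and nucleotide portions can be tabulated with a short script. This is only an illustrative sketch: the helper name `length_window` is hypothetical, and reading each "basic number ± tolerance" as a closed interval is an assumption, since the text does not define the bounds more precisely.

```python
from __future__ import annotations

# Illustrative sketch only. `length_window` is a hypothetical helper, and
# treating "basic number ± tolerance" as a closed interval is an assumption.

AMINO_ACID_BASIC = 660    # polypeptide chain of SEQ ID NO. 2, as stated in the text
NUCLEOTIDE_BASIC = 1900   # encoding sequence of SEQ ID NO. 1, as stated in the text


def length_window(basic_number: int, tolerance: int) -> tuple[int, int]:
    """Return (shortest, longest) length for basic_number ± tolerance."""
    return basic_number - tolerance, basic_number + tolerance


# Tolerances taken from the text, from the narrowest to the widest window.
amino_acid_windows = {
    tol: length_window(AMINO_ACID_BASIC, tol) for tol in (10, 50, 100, 250)
}
nucleotide_windows = {
    tol: length_window(NUCLEOTIDE_BASIC, tol) for tol in (10, 50, 100, 500, 700, 1000)
}

for tol, (low, high) in amino_acid_windows.items():
    print(f"660 ± {tol} amino acids: {low} to {high}")
for tol, (low, high) in nucleotide_windows.items():
    print(f"1900 ± {tol} nucleotides: {low} to {high}")
```

Under this reading, the widest stated windows span 410 to 910 amino acids and 900 to 2900 nucleotides.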

An allelic variant of the instant nucleotide sequence is understood to be any
different nucleotide sequence which encodes a functionally equivalent
polypeptide. The alleles include naturally occurring variants of the
instant nucleotide sequences as well as synthetic nucleotide sequences
produced by methods known in the art. Also contemplated are altered
nucleotide sequences which result in an enzyme with altered activity and/or
regulation or which is resistant against specific inhibitors. The instant invention
further includes natural or synthetic mutations of the originally isolated
nucleotide sequences. These mutations can be substitution, addition, deletion,
inversion or insertion of one or more nucleotides.

A homologous nucleotide sequence is understood to be a complementary
sequence and/or a sequence which specifically hybridizes with the instant
nucleotide sequence. Hybridizing sequences include similar sequences
selected from the group of DNA or RNA which specifically interact with the instant
nucleotide sequences under at least moderate stringency conditions which are
known in the art. A preferred, non-limiting example of stringent hybridization
conditions is hybridization in 6 x sodium chloride/sodium citrate (SSC) at
about 45°C, followed by one or more washes in 0.2 x SSC, 0.1% SDS at
50-65°C. This further includes short nucleotide sequences of e.g. 10 to 30
nucleotides, preferably 12 to 15 nucleotides. Also included are primers or
hybridization probes.

A homologous nucleotide sequence included within the scope of the instant
invention is a sequence which is at least about 40%, preferably at least about
50% or 60%, more preferably at least about 70%, 80% or 90%, and most
preferably at least about 95%, 96%, 97%, 98% or 99% or more homologous to
a nucleotide sequence of SEQ ID NO. 1.
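
The homology tiers listed above can be illustrated with a small sketch. It assumes that "homologous" is measured as percent identity over the gap-free columns of a pairwise alignment that has already been computed; the text does not specify how homology is scored, and the function names below are hypothetical.

```python
from __future__ import annotations

# Minimal sketch, assuming "% homologous" means percent identity over the
# gap-free columns of a pre-computed pairwise alignment. The text does not
# define the scoring method; both function names are hypothetical.

# Thresholds named in the text, from most to least stringent.
THRESHOLDS = (99, 98, 97, 96, 95, 90, 80, 70, 60, 50, 40)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free alignment columns in which both sequences agree."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def highest_threshold_met(identity: float) -> int | None:
    """Return the most stringent listed threshold that `identity` reaches."""
    for threshold in THRESHOLDS:
        if identity >= threshold:
            return threshold
    return None  # below the 40% floor named in the text
```

For example, two aligned ten-nucleotide toy sequences differing at one position give 90% identity and therefore fall in the "at least about 90%" tier under this assumed measure.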

All of the aforementioned definitions apply equally to amino acid sequences and
functional enzymes and can easily be transferred by a person skilled in the art.

Isoenzymes are understood to be enzymes which have the same or a similar
substrate specificity and/or catalytic activity but a different primary structure.

In a first embodiment, this invention is directed to nucleic acid sequences that
encode a PDAT. This includes sequences that encode biologically active
PDATs as well as sequences that are to be used as probes, vectors for
transformation or cloning intermediates. The PDAT encoding sequence may
encode a complete or partial sequence depending upon the intended use. All or
a portion of the genomic sequence, cDNA sequence, precursor PDAT or
mature PDAT is intended.

Further included is a nucleotide sequence selected from the group consisting of
sequences set forth in SEQ ID NO. 1, 1b, 3, 3b, 4, 4a, 4b, 9b, 10, 10b or 11 or a
portion, derivative, allele or homologue thereof. The invention also pertains to a partial
nucleotide sequence corresponding to a full-length nucleotide sequence, selected
from the group consisting of sequences set forth in SEQ ID NO. 5, 5b, 6b, 7, 8b,
9, 11b or 12 or a portion, derivative, allele or homologue thereof. Moreover, a
nucleotide sequence comprising a nucleotide sequence which is at least 40%
homologous to a nucleotide sequence selected from the group consisting of
those sequences set forth in SEQ ID NO. 1, 1b, 3, 3b, 4, 4a, 4b, 5, 5b, 6b, 7, 8b,
9, 9b, 10, 10b, 11, 11b or 12 is contemplated within the scope of the invention.

The instant invention pertains to a gene construct comprising a said nucleotide
sequence of the instant invention which is operably linked to a heterologous
nucleic acid.

The term operably linked means a serial organisation e.g. of a promoter, coding
sequence, terminator and/or further regulatory elements whereby each element
can fulfill its original function during expression of the nucleotide sequence.

Further, a vector comprising a said nucleotide sequence of the instant
invention is contemplated in the instant invention. This also includes an
expression vector as well as a vector further comprising a selectable marker
gene and/or nucleotide sequences for the replication in a host cell and/or the
integration into the genome of the host cell.

In a different aspect, this invention relates to a method for producing a PDAT in
a host cell or progeny thereof, including genetically engineered oil seeds, yeast
and moulds or any other oil accumulating organism, via the expression of a
construct in the cell. Cells containing a PDAT as a result of the production of
the PDAT encoding sequence are also contemplated within the scope of the
invention.

Further, the invention pertains to a transgenic cell or organism containing a said
nucleotide sequence and/or a said gene construct and/or a said vector. The
object of the instant invention is further a transgenic cell or organism which is
a eukaryotic cell or organism. Preferably, the transgenic cell or organism is a
yeast cell or a plant cell or a plant. The instant invention further pertains to said
transgenic cell or organism having an altered biosynthetic pathway for the
production of triacylglycerol. A transgenic cell or organism having an altered oil
content is also contemplated within the scope of this invention.

Further, the invention pertains to a transgenic cell or organism wherein the activity
of PDAT is altered in said cell or organism. This altered activity of PDAT is
characterized by an alteration in gene expression, catalytic activity and/or
regulation of activity of the enzyme. Moreover, a transgenic cell or organism is
included in the instant invention, wherein the altered biosynthetic pathway for
the production of triacylglycerol is characterized by the prevention of
accumulation of undesirable fatty acids in the membrane lipids.

In a different embodiment, this invention also relates to methods of using a 
DNA sequence encoding a PDAT for increasing the oil-content within a cell. 

Another aspect of the invention relates to the accommodation of high amounts
of uncommon fatty acids in the triacylglycerol produced within a cell, by
introducing a DNA sequence producing a PDAT that specifically removes these
fatty acids from the membrane lipids of the cell and channels them into
triacylglycerol. Plant cells having such a modification are also contemplated
herein.

Further, the invention pertains to a process for the production of triacylglycerol,
comprising growing a said transgenic cell or organism under conditions
whereby the said nucleotide sequence is expressed and whereby the said
transgenic cells comprise a said enzyme catalysing the transfer of fatty
acids from phospholipids to diacylglycerol, forming triacylglycerol.

Moreover, triacylglycerols produced by the aforementioned process are
included in the scope of the instant invention.

A further object of the instant invention is the use of an instant nucleotide
sequence and/or a said enzyme for the production of triacylglycerol and/or
triacylglycerols with uncommon fatty acids. The use of a said instant nucleotide
sequence and/or a said enzyme of the instant invention for the transformation
of any cell or organism in order to be expressed in this cell or organism and
result in an altered, preferably increased, oil content of this cell or organism is
also contemplated within the scope of the instant invention.

A PDAT of this invention includes any sequence of amino acids, such as a
protein, polypeptide or peptide fragment, obtainable from a microorganism,
animal or plant source that demonstrates the ability to catalyse the production
of triacylglycerol from a phospholipid and diacylglycerol under enzyme reactive
conditions. By "enzyme reactive conditions" is meant that any necessary
conditions are available in an environment (e.g., such factors as temperature,
pH, lack of inhibiting substances) which will permit the enzyme to function.

Other PDATs are obtainable from the specific sequences provided herein.
Furthermore, it will be apparent that one can obtain natural and synthetic
PDATs, including modified amino acid sequences and starting materials for
synthetic-protein modelling from the exemplified PDATs and from PDATs
which are obtained through the use of such exemplified sequences. Modified
amino acid sequences include sequences that have been mutated, truncated,
increased and the like, whether such sequences were partially or wholly
synthesised. Sequences that are actually purified from plant preparations or
are identical or encode identical proteins thereto, regardless of the method
used to obtain the protein or sequence, are equally considered naturally
derived.

Further, the nucleic acid probes (DNA and RNA) of the present invention can
be used to screen and recover "homologous" or "related" PDATs from a
variety of plant and microbial sources.

Further, it is also apparent that a person skilled in the art can, with the
information provided in this application, identify a PDAT activity in any
organism, purify an enzyme with this activity and thereby identify a
"non-homologous" nucleic acid sequence encoding such an enzyme.

The present invention can be essentially characterized by the following
aspects:

1. Use of a PDAT gene (genomic clone or cDNA) for transformation.

2. Use of a DNA molecule according to item 1 wherein said DNA is used for
transformation of any organism in order to be expressed in this organism
and result in an active recombinant PDAT enzyme in order to increase oil
content of the organism.

3. Use of a DNA molecule of item 1 wherein said DNA is used for
transformation of any organism in order to prevent the accumulation of
undesirable fatty acids in the membrane lipids.

4. Use according to item 1, wherein said PDAT gene is used for transforming
transgenic oil accumulating organisms engineered to produce any
uncommon fatty acid which is harmful if present in high amounts in
membrane lipids, such as medium chain fatty acids, hydroxylated fatty
acids, epoxygenated fatty acids and acetylenic fatty acids.

5. Use according to item 1, wherein said PDAT gene is used for transforming
organisms, and wherein said organisms are crossed with other oil
accumulating organisms engineered to produce any uncommon fatty acid
which is harmful if present in high amounts in membrane lipids, comprising
medium chain fatty acids, hydroxylated fatty acids, epoxygenated fatty
acids and acetylenic fatty acids.

6. Use according to item 1, wherein said PDAT gene or cDNA encodes a
PDAT with distinct acyl specificity.

7. Use according to item 1 wherein said PDAT encoding gene or cDNA is
derived from Saccharomyces cerevisiae, or contains nucleotide sequences
coding for an amino acid sequence 30% or more identical to the amino acid
sequence of PDAT as presented in SEQ ID NO. 2.

8. Use according to item 1 wherein said PDAT encoding gene or cDNA is
derived from Saccharomyces cerevisiae, or contains nucleotide sequences
coding for an amino acid sequence 40% or more identical to the amino acid
sequence of PDAT as presented in SEQ ID NO. 2.

9. Use according to item 1 wherein said PDAT encoding gene or cDNA is
derived from Saccharomyces cerevisiae, or contains nucleotide sequences
coding for an amino acid sequence 60% or more identical to the amino acid
sequence of PDAT as presented in SEQ ID NO. 2.

10. Use according to item 1 wherein said PDAT encoding gene or cDNA is
derived from Saccharomyces cerevisiae, or contains nucleotide sequences
coding for an amino acid sequence 80% or more identical to the amino acid
sequence of PDAT as presented in SEQ ID NO. 2.

11. Use according to item 1 wherein said PDAT encoding gene or cDNA is
derived from plants or contains nucleotide sequences coding for an amino
acid sequence 40% or more identical to the amino acid sequence of PDAT
from Arabidopsis thaliana or to the protein encoded by the full-length
counterpart of the partial Zea mays, Lycopersicon esculentum, or
Neurospora crassa cDNA clones.

12. Transgenic oil accumulating organisms comprising, in their genome, a
PDAT gene transferred by recombinant DNA technology or somatic
hybridization.

13. Transgenic oil accumulating organisms according to item 12 comprising, in
their genome, a PDAT gene having specificity for substrates with a
particular uncommon fatty acid and the gene for said uncommon fatty acid.

14. Transgenic organisms according to item 12 or 13 which are selected from
the group consisting of fungi, plants and animals.

15. Transgenic organisms according to item 12 or 13 which are selected from
the group of agricultural plants.

16. Transgenic organisms according to item 12 or 13 which are selected from
the group of agricultural plants and where said PDAT gene is expressed
under the control of a storage organ specific promoter.

17. Transgenic organisms according to item 12 or 13 which are selected from
the group of agricultural plants and where said PDAT gene is expressed
under the control of a seed promoter.

18. Oils from organisms according to items 12-17.

19. A method for altering acyl specificity of a PDAT by alteration of the
nucleotide sequence of a naturally occurring encoding gene and, as a
consequence of this alteration, creating a gene encoding an enzyme
with novel acyl specificity.

20. A protein encoded by a DNA molecule according to item 1 or a functional
fragment thereof.

21. A protein of item 20 designated phospholipid:diacylglycerol acyltransferase.

22. A protein of item 21 which has a distinct acyl specificity.

23. A protein of item 13 having the amino acid sequence as set forth in SEQ
ID NO. 2, 13, 14 or 15 (and the proteins encoded by the full-length or partial
genes set forth in SEQ ID NO. 1, 3, 4, 5, 7, 9, 10, 11 or 12) or an amino
acid sequence with at least 30% homology to said amino acid sequence.

24. A protein of item 23 isolated from Saccharomyces cerevisiae.

General methods:

Yeast strains and plasmids. The wild type yeast strains used were either
FY1679 (MATa his3-Δ200 leu2-Δ1 trp1-Δ6 ura3-52) or W303-1A (MATa ade2-1
can1-100 his3-11,15 leu2-3,112 trp1-1 ura3-1) (7). The YNR008w::KanMX2
disruption strain FVKT004-04C(AL), which is congenic to FY1679, was
obtained from the Euroscarf collection (8). A 2751 bp fragment containing the
YNR008w gene with 583 bp of 5' and 183 bp of 3' flanking DNA was amplified
from W303-1A genomic DNA using Taq polymerase with
5'-TCTCCATCTTCTGCAAAACCT-3' and 5'-CCTGTCAAAAACCTTCTCCTC-3' as
primers. The resulting PCR product was purified by agarose gel electrophoresis
and cloned into the EcoRV site of pBluescript (pBluescript-pdat). For
complementation experiments, the cloned fragment was released from
pBluescript by HindIII-SacI digestion and then cloned between the HindIII and
SacI sites of pFL39 (9), thus generating pUS1. For overexpression of the PDAT
gene, a 2202 bp EcoRI fragment from the pBluescript plasmid, which contains
only 24 bp of 5' flanking DNA, was cloned into the BamHI site of the
GAL1-TPK2 expression vector pJN92 (12), thus generating pUS4.
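
The primer and fragment figures quoted above can be sanity-checked with a few lines of code. This is a hedged sketch rather than part of the described method: the helper names are hypothetical, and the "implied gene length" is simply the stated fragment size minus the two stated flanking regions, a figure the text itself does not give.

```python
# Hedged sketch: simple checks on the primers and fragment sizes quoted in
# the text. Helper names are hypothetical. The implied gene length is plain
# arithmetic on the stated numbers, not a value stated in the document.

FORWARD_PRIMER = "TCTCCATCTTCTGCAAAACCT"
REVERSE_PRIMER = "CCTGTCAAAAACCTTCTCCTC"

AMPLICON_BP = 2751   # amplified fragment, as stated
FLANK_5_BP = 583     # 5' flanking DNA, as stated
FLANK_3_BP = 183     # 3' flanking DNA, as stated

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_count(sequence: str) -> int:
    """Number of G or C bases in a DNA sequence."""
    return sum(1 for base in sequence.upper() if base in "GC")


def gc_fraction(sequence: str) -> float:
    """Fraction of bases that are G or C."""
    return gc_count(sequence) / len(sequence)


def reverse_complement(sequence: str) -> str:
    """Reverse complement of an unambiguous DNA sequence (A, C, G, T only)."""
    return sequence.upper().translate(_COMPLEMENT)[::-1]


# Length of the region between the flanks implied by the stated sizes.
implied_gene_bp = AMPLICON_BP - FLANK_5_BP - FLANK_3_BP

for name, primer in (("forward", FORWARD_PRIMER), ("reverse", REVERSE_PRIMER)):
    print(f"{name}: {len(primer)} nt, GC fraction {gc_fraction(primer):.2f}")
print(f"implied length between flanks: {implied_gene_bp} bp")
```

Both primers are 21 nucleotides long, and the stated sizes leave 1985 bp between the two flanking regions.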

Microsomal preparations. Microsomes from developing seeds of sunflower
(Helianthus annuus), Ricinus communis and Crepis palaestina were prepared
using the procedure of Stobart and Stymne (11). To obtain yeast microsomes,
1 g of yeast cells (fresh weight) was re-suspended in 8 ml of ice-cold buffer (20
mM Tris-Cl, pH 7.9, 10 mM MgCl2, 1 mM EDTA, 5% (v/v) glycerol, 1 mM DTT,
0.3 M ammonium sulfate) in a 12 ml glass tube. To this tube, 4 ml of glass
beads (diameter 0.45-0.5 mm) were added, and the tube was then vigorously
shaken (3 x 60 s) in an MSK cell homogenizer (B. Braun Melsungen AG,
Germany). The homogenized suspension was centrifuged at 20,000 x g for 15
min at 6°C and the resulting supernatant was again centrifuged at 100,000 x g
for 2 h at 6°C. The 100,000 x g pellet was resuspended in 0.1 M potassium
phosphate (pH 7.2), and stored at -80°C. It is subsequently referred to as the
crude yeast microsomal fraction.

Lipid substrates. Radio-labeled ricinoleic (12-hydroxy-octadecenoic) and
vernolic (12,13-epoxy-octadecenoic) acids were synthesized enzymatically from
[1-14C]oleic acid and [1-14C]linoleic acid, respectively, by incubation with
microsomal preparations from seeds of Ricinus communis and Crepis
palaestina, respectively (12). The synthesis of phosphatidylcholines (PC) or
phosphatidylethanolamines (PE) with 14C-labeled acyl groups in the sn-2
position was performed using either enzymatic (13) or synthetic (14) acylation
of [14C]oleic, [14C]ricinoleic, or [14C]vernolic acid. Dioleoyl-PC that was labeled
in the sn-1 position was synthesized from sn-1-[14C]oleoyl-lyso-PC and
unlabeled oleic acid as described in (14). Sn-1-oleoyl-sn-2-[14C]ricinoleoyl-DAG
was synthesized from PC by the action of phospholipase C type XI from B.
cereus (Sigma Chemical Co.) as described in (15). Monovernoloyl- and
divernoloyl-DAG were synthesized from TAG extracted from seeds of
Euphorbia lagascae, using the TAG-lipase (Rhizopus arrhizus, Sigma Chemical
Co.) as previously described (16). Monoricinoleoyl-TAG was synthesized
according to the same method using TAG extracted from castor bean.

Lipid analysis. The total lipid composition of yeast was determined from cells
harvested from a 40 ml liquid culture, broken in a glass-bead shaker and
extracted into chloroform as described by Bligh and Dyer (17). The lipids were
then separated by thin layer chromatography in hexane/diethylether/acetic acid
(80:20:1) using pre-coated silica gel 60 plates (Merck). The lipid areas were
located by brief exposure to I2 vapors and identified by means of appropriate
standards. Polar lipids, sterol-esters and triacylglycerols, as well as the
remaining minor lipid classes, referred to as other lipids, were excised from the
plates. Fatty acid methylesters were prepared by heating the dry excised
material at 85 °C for 60 min in 2% (v/v) sulfuric acid in dry methanol. The
methyl esters were extracted with hexane and analyzed by GLC through a 50 m
x 0.32 mm CP-Wax58-CB fused-silica column (Chrompack), with
methylheptadecanoic acid as an internal standard. The fatty acid content of
each fraction was quantified and used to calculate the relative amount of each
lipid class. In order to determine the total lipid content, 3 ml aliquots from yeast
cultures were harvested by centrifugation and the resulting pellets were washed
with distilled water and lyophilized. The weight of the dried cells was
determined and the fatty acid content was quantified by GLC analysis after
conversion to methylesters as described above. The lipid content was then
calculated as nmol fatty acid (FA) per mg dry weight yeast.
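
The calculation described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used: it assumes that each peak area is proportional to the molar amount of the methyl ester, with the same response factor as the internal standard, and the function and parameter names are hypothetical.

```python
def fatty_acid_nmol(peak_areas, standard_area, standard_nmol):
    """Convert GLC peak areas to nmol using an internal standard.

    peak_areas: dict mapping a fatty acid label to its peak area.
    standard_area: peak area of the internal standard.
    standard_nmol: amount of internal standard added, in nmol.
    Assumption (not from the text): equal molar response for all peaks.
    """
    if standard_area <= 0:
        raise ValueError("standard_area must be positive")
    return {
        name: area / standard_area * standard_nmol
        for name, area in peak_areas.items()
    }


def lipid_content_nmol_per_mg(peak_areas, standard_area, standard_nmol,
                              dry_weight_mg):
    """Total fatty acid content expressed as nmol FA per mg dry weight."""
    if dry_weight_mg <= 0:
        raise ValueError("dry_weight_mg must be positive")
    amounts = fatty_acid_nmol(peak_areas, standard_area, standard_nmol)
    return sum(amounts.values()) / dry_weight_mg
```

The same per-peak conversion, applied to each excised lipid class separately, gives the relative amount of each class.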

Enzyme assays. Aliquots of crude microsomal fractions (corresponding to
10 nmol of microsomal PC) from developing plant seeds or yeast cells were
lyophilized overnight. 14C-Labeled substrate lipids dissolved in benzene were
then added to the dried microsomes. The benzene was evaporated under a
stream of N2, leaving the lipids in direct contact with the membranes, and 0.1
ml of 50 mM potassium phosphate (pH 7.2) was added. The suspension was
thoroughly mixed and incubated at 30°C for the time period indicated, up to 90
min. Lipids were extracted from the reaction mixture using chloroform and
separated by thin layer chromatography in hexane/diethylether/acetic acid
(35:70:1.5) using silica gel 60 plates (Merck). The radioactive lipids were
visualized and quantified on the plates by electronic autoradiography (Instant
Imager, Packard, US).

Yeast cultivation. Yeast cells were grown at 28°C on a rotary shaker in
liquid YPD medium (1% yeast extract, 2% peptone, 2% glucose), synthetic
medium (18) containing 2% (v/v) glycerol and 2% (v/v) ethanol, or minimal
medium (19) containing 16 g/l of glycerol.

The instant invention is further characterized by the following examples which 
are not limiting: 

Acyl-CoA-independent synthesis of TAG by oil seed microsomes. A large
number of unusual fatty acids can be found in oil seeds (20). Many of these
fatty acids, such as ricinoleic (21) and vernolic acids (22), are synthesized
using phosphatidylcholine (PC) with oleoyl or linoleoyl groups esterified to the
sn-2 position, respectively, as the immediate precursor. However, even though
PC can be a substrate for unusual fatty acid synthesis and is the major
membrane lipid in seeds, unusual fatty acids are rarely found in the
membranes. Instead, they are mainly incorporated into the TAG. A mechanism
for efficient and selective transfer of these unusual acyl groups from PC into
TAG must therefore exist in oil seeds that accumulate such unusual fatty acids.
This transfer reaction was biochemically characterized in seeds from castor
bean (Ricinus communis) and Crepis palaestina, plants which accumulate high
levels of ricinoleic and vernolic acid, respectively, and sunflower (Helianthus
annuus), a plant which has only common fatty acids in its seed oil. Crude
microsomal fractions from developing seeds were incubated with PC having
14C-labeled oleoyl, ricinoleoyl or vernoloyl groups at the sn-2 position. After the
incubation, lipids were extracted and analyzed by thin layer chromatography.
We found that the amount of radioactivity that was incorporated into the neutral
lipid fraction increased linearly over a period of 4 hours (data not shown). The
distribution of [14C]acyl groups within the neutral lipid fraction was analyzed
after 80 min (Fig. 1). Interestingly, the amount and distribution of radioactivity
between different neutral lipids were strongly dependent both on the plant
species and on the type of [14C]acyl chain. Thus, sunflower microsomes
incorporated most of the label into DAG, regardless of the type of [14C]acyl
group. In contrast, R. communis microsomes preferentially incorporated
[14C]ricinoleoyl and [14C]vernoloyl groups into TAG, while [14C]oleoyl groups
were mostly found in DAG. C. palaestina microsomes, finally, incorporated only
[14C]vernoloyl groups into TAG, with [14C]ricinoleoyl groups being found mostly
as free fatty acids, and [14C]oleoyl groups in DAG. This shows that the high in
vivo levels of ricinoleic acid and vernolic acid in the TAG pool of R. communis
and C. palaestina, respectively, can be explained by an efficient and selective
transfer of the corresponding acyl groups from PC to TAG in these organisms.
The in vitro synthesis of triacylglycerols in microsomal preparations of
developing castor bean is summarized in Table 1.

PDAT: a novel enzyme that catalyzes acyl-CoA-independent synthesis of
TAG. It was investigated whether DAG could serve both as an acyl donor and
as an acyl acceptor in the reactions catalyzed by the oil seed microsomes.
Therefore, unlabeled divernoloyl-DAG was incubated with either sn-1-oleoyl-
sn-2-[14C]ricinoleoyl-DAG or sn-1-oleoyl-sn-2-[14C]ricinoleoyl-PC in the
presence of R. communis microsomes. The synthesis of TAG molecules
containing both [14C]ricinoleoyl and vernoloyl groups was 5-fold higher when
[14C]ricinoleoyl-PC served as acyl donor as compared to [14C]ricinoleoyl-DAG
(Fig. 1B). These data strongly suggest that PC is the immediate acyl donor and
DAG the acyl acceptor in the acyl-CoA-independent formation of TAG by oil
seed microsomes. Therefore, this reaction is catalyzed by a new enzyme which
we call phospholipid:diacylglycerol acyltransferase (PDAT).

PDAT activity in yeast microsomes. Wild type yeast cells were cultivated
under conditions where TAG synthesis is induced. Microsomal membranes
were prepared from these cells and incubated with sn-2-[14C]ricinoleoyl-PC
and DAG, and the 14C-labeled products formed were analyzed. The PC-derived
[14C]ricinoleoyl groups within the neutral lipid fraction were found mainly in free
fatty acids or TAG, and the amount of TAG synthesized was
dependent on the amount of DAG that was added to the reaction (Fig. 2). The in
vitro synthesis of TAG containing both ricinoleoyl and vernoloyl groups, a TAG
species not present in vivo, from exogenously added sn-2-[14C]ricinoleoyl-PC and
unlabelled vernoloyl-DAG (Fig. 2, lane 3) clearly demonstrates the existence of
an acyl-CoA-independent synthesis of TAG involving PC and DAG as
substrates in yeast microsomal membranes. Consequently, TAG synthesis in
yeast can be catalyzed by an enzyme similar to the PDAT found in plants.

The PDAT encoding gene in yeast

A gene in the yeast genome (YNR008w) is known, but nothing is known
about the function of YNR008w, except that the gene is not essential for
growth under normal circumstances. Microsomal membranes were prepared
from the yeast strain FVKT004-04C(AL) (8) in which this gene with unknown
function had been disrupted. PDAT activity in the microsomes was assayed
using PC with radiolabelled fatty acids at the sn-2 position. The activity was
found to be completely absent in the disruption strain (Fig. 2, lane 4).
Significantly, the activity could be partially restored by the presence of
YNR008w on the single copy plasmid pUS1 (Fig. 2, lane 5). Moreover, acyl
groups of phosphatidylethanolamine (PE) were efficiently incorporated into
TAG by microsomes from the wild type strain whereas no incorporation
occurred from this substrate in the mutant strain. This shows that YNR008w
encodes a yeast PDAT which catalyzes the transfer of an acyl group from the
sn-2 position of phospholipids to DAG, thus forming TAG. It should be noted
that no cholesterol esters were formed from radioactive PC even in
incubations with added ergosterols, nor was the amount of radioactive free
fatty acids formed from PC affected by disruption of the YNR008w gene. This
demonstrates that yeast PDAT does not have cholesterol ester synthesising or
phospholipase activities.

Increased TAG content in yeast cells that overexpress PDAT. The effect
of overexpressing the PDAT-encoding gene was studied by transforming a wild
type yeast strain with the pUS4 plasmid in which the gene is expressed from
the galactose-induced GAL1:TPK2 promoter. Cells containing the empty
expression vector were used as a control. The cells were grown in synthetic
glycerol-ethanol medium, and expression of the gene was induced after either 2
hours (early log phase) or 25 hours (stationary phase) by the addition of
galactose. The cells were then incubated for another 21 hours, after which they
were harvested and assays were performed. We found that overexpression of
PDAT had no significant effect on the growth rate as determined by the optical
density. However, the total lipid content, measured as total pmol fatty acids per
mg yeast dry weight, was 47% (log phase) or 29% (stationary phase) higher in
the PDAT overexpressing strain than in the control. Furthermore, the polar lipid
and sterol ester content was unaffected by overexpression of PDAT. Instead,
the elevated lipid content in these cells is entirely due to an increased TAG
content (Fig. 3A, B). Thus, the amount of TAG was increased 2-fold in PDAT
overexpressing early log phase cells and by 40% in stationary phase cells. It is
interesting to note that a significant increase in the TAG content was achieved
by overexpressing PDAT even under conditions (i.e. in stationary phase) where
DAGAT is induced and thus contributes significantly to TAG synthesis. In vitro
PDAT activity, assayed in microsomes from the PDAT overexpressing strain,
was 7-fold higher than in the control strain, a finding which is consistent with the
increased levels of TAG that we observed in vivo (Fig. 3C). These results
clearly demonstrate the potential use of the PDAT gene in increasing the oil
content in transgenic organisms.

Substrate specificity of yeast PDAT. The substrate specificity of yeast
PDAT was analyzed using microsomes prepared from the PDAT
overexpressing strain (see Fig. 4). The rate of TAG synthesis, under conditions
given in Figure 4 with di-oleoyl-PC as the acyl donor, was 0.15 nmol per min
and mg protein. With both oleoyl groups of PC labeled it was possible, under
the given assay conditions, to detect the transfer of 11 pmol/min of [14C]oleoyl
chain into TAG and the formation of 15 pmol/min of lyso-PC. In microsomes
from the PDAT-deficient strain, no TAG at all and only trace amounts of lyso-PC
were detected, strongly suggesting that yeast PDAT catalyses the formation
of equimolar amounts of TAG and lyso-PC when supplied with PC and DAG as
substrates. The fact that somewhat more lyso-PC than TAG is formed can be
explained by the presence of a phospholipase in yeast microsomes, which
produces lyso-PC and unesterified fatty acids from PC (data not shown).

The specificity of yeast PDAT for different acyl group positions was
investigated by incubating the microsomes with di-oleoyl-PC carrying a
[14C]acyl group either at the sn-1 position (Fig. 4A, bar 2) or the sn-2 position
(Fig. 4A, bar 3). We found that the major 14C-labeled product formed in the
former case was lyso-PC, and in the latter case TAG. We conclude that yeast
PDAT has a specificity for the transfer of acyl groups from the sn-2 position of
the phospholipid to DAG, thus forming sn-1-lyso-PC and TAG. Under the given
assay conditions, trace amounts of 14C-labelled DAG are formed from the sn-1
labeled PC by the reversible action of a CDP-choline : choline
phosphotransferase (data not shown). This labeled DAG can then be further
converted into TAG by the PDAT activity. It is therefore not possible to
distinguish whether the minor amounts of labeled TAG that are formed in the
presence of di-oleoyl-PC carrying a [14C]acyl group in the sn-1 position are
synthesized directly from the sn-1-labeled PC by a PDAT that also can act on
the sn-1 position, or whether the PC is first converted to sn-1-labeled DAG and
then acylated by a PDAT with strict selectivity for the transfer of acyl groups at
the sn-2 position of PC. Taken together, this shows that the PDAT encoded by
YNR008w catalyses an acyl transfer from the sn-2 position of PC to DAG, thus
causing the formation of TAG and lyso-PC.

The substrate specificity of yeast PDAT was further analyzed with respect
to the headgroup of the acyl donor, the acyl group transferred and the acyl
chains of the acceptor DAG molecule. The two major membrane lipids of S.
cerevisiae are PC and PE, and as shown in Fig. 4B (bars 1 and 2), dioleoyl-PE
is nearly 4-fold more efficient than dioleoyl-PC as acyl donor in the PDAT-
catalyzed reaction. Moreover, the rate of acyl transfer is strongly dependent on
the type of acyl group that is transferred. Thus, a ricinoleoyl group at the sn-2
position of PC is 2.5 times more efficiently transferred into TAG than an oleoyl
group in the same position (Fig. 4B, bars 1 and 3). In contrast, yeast PDAT has
no preference for the transfer of vernoloyl groups over oleoyl groups (Fig. 4B,
bars 1 and 4). The acyl chain of the acceptor DAG molecule also affects the
efficiency of the reaction. Thus, DAG with a ricinoleoyl or a vernoloyl group is a
more efficient acyl acceptor than dioleoyl-DAG (Fig. 4B, bars 1, 5 and 6). Taken
together, these results clearly show that the efficiency of the PDAT-catalyzed
acyl transfer is strongly dependent on the properties of the substrate lipids.

PDAT genes. Nucleotide and amino acid sequences of several PDAT
genes are given as SEQ ID NO. 1 through 15. Further provisional and/or partial
sequences are given as SEQ ID NO. 1a through 5a and 1b through 11b,
respectively. One of the Arabidopsis genomic sequences (SEQ ID NO. 4)
identified an Arabidopsis EST cDNA clone, T04806. This cDNA clone was fully
characterised and the nucleotide sequence is given as SEQ ID NO. 5. Based
on the sequence homology of the T04806 cDNA and the Arabidopsis thaliana
genomic DNA sequence (SEQ ID NO. 4) it is apparent that an additional A is
present at position 417 in the cDNA clone (data not shown). Excluding this
nucleotide would give the amino acid sequence depicted in SEQ ID NO. 12.
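
The effect of excluding a single extra nucleotide can be illustrated with a short sketch. It removes one base at a given 1-based position and translates the result with the standard genetic code. The function names are hypothetical, and the actual T04806 sequence and its reading frame are not reproduced here.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def delete_base(seq, position):
    """Return seq without the nucleotide at the 1-based position."""
    if not 1 <= position <= len(seq):
        raise ValueError("position outside the sequence")
    return seq[:position - 1] + seq[position:]


def translate(seq, frame=0):
    """Translate a DNA sequence up to the first stop codon.

    frame is the 0-based offset of the first codon (an assumption the
    caller must supply); unknown codons are rendered as 'X'.
    """
    seq = seq.upper()
    protein = []
    for i in range(frame, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)
```

For a toy sequence such as ATGAGCTTTTTAA, deleting the base at position 4 restores the frame ATG GCT TTT TAA, so every codon downstream of the deletion is read differently than before.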

Increased TAG content in seeds of Arabidopsis thaliana that express the
yeast PDAT. For the expression of the yeast PDAT gene in Arabidopsis thaliana
an EcoRI fragment from the pBluescript-pdat was cloned together with the napin
promoter (26) into the vector pGPTV-KAN (27). A plasmid (pGNapPDAT)
having the yeast PDAT gene in the correct orientation was identified and
transformed into Agrobacterium tumefaciens. These bacteria were used to
transform Arabidopsis thaliana Columbia (C-24) plants using the root
transformation method (28). Plants transformed with an empty vector were
used as controls.

First generation seeds (T1) were harvested and germinated on kanamycin
containing medium. Second generation seeds (T2) were pooled from individual
plants and their fatty acid contents analysed by quantification of their methyl
esters by gas liquid chromatography after methylation of the seeds with 2%
sulphuric acid in methanol at 85 °C for 1.5 hours. Quantification was done with
heptadecanoic acid methyl esters as internal standard.

From the transformation with pGNapPDAT one T1 plant (26-14) gave rise to
seven T2 plants, of which 3 plants yielded seeds with statistically significantly
(in a mean difference two-sided test) higher oil content than seeds from T2
plants generated from T1 plant 32-4 transformed with an empty vector (Table 2).
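
The text does not specify which two-sided mean difference test was applied. As one possible illustration only, the sketch below runs an exact two-sided permutation test on the difference between two group means. The function name is hypothetical, and the function is meant to be called with measured values, none of which are reproduced here.

```python
from itertools import combinations
from statistics import mean


def permutation_test_mean_difference(group_a, group_b):
    """Exact two-sided permutation test on the difference in means.

    Returns the fraction of all relabelings of the pooled values whose
    absolute mean difference is at least as large as the observed one.
    Intended for small groups, since every relabeling is enumerated.
    """
    if not group_a or not group_b:
        raise ValueError("both groups must contain at least one value")
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    for indices in combinations(range(len(pooled)), n_a):
        chosen = set(indices)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        # Small tolerance so that ties with the observed value count.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total
```

A returned value at or below the chosen significance level would indicate a difference in mean fatty acid content between the two groups of T2 seed pools.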
Claims

1. An enzyme catalysing in an acyl-CoA-independent reaction the transfer of
fatty acids from phospholipids to diacylglycerol in the biosynthetic pathway
for the production of triacylglycerol.

2. An enzyme according to claim 1, comprising an amino acid sequence as
set forth in SEQ ID No. 2 or a functional fragment, derivate, allele,
homologue or isoenzyme thereof.

3. An enzyme according to claims 1 or 2 designated as
phospholipid:diacylglycerol acyltransferase (PDAT).

4. An enzyme according to claims 1 to 3, comprising an amino acid sequence
as set forth in SEQ ID No. 1a, 2b or 5a or a functional fragment, derivate,
allele, homologue or isoenzyme thereof.

5. An enzyme according to claims 1 to 4, comprising an amino acid sequence
selected from the group consisting of sequences as set forth in SEQ ID No.
2a, 3a, 5b, 6, 7b, 8, 13, 14, 15 or a functional fragment, derivate, allele,
homologue or isoenzyme thereof.

6. An enzyme according to claims 1 to 5, comprising an amino acid sequence
encoded through a nucleotide sequence, a portion, derivate, allele or
homologue thereof selected from the group consisting of sequences as set
forth in SEQ ID No. 1, 1b, 3, 3b, 4, 4a, 4b, 5, 5b, 6b, 7, 8b, 9, 9b, 10, 10b,
11, 11b, 12 or a functional fragment, derivate, allele, homologue or
isoenzyme of the enzyme encoding amino acid sequence.

7. A nucleotide sequence encoding an enzyme catalysing in an acyl-CoA-
independent reaction the transfer of fatty acids from phospholipids to
diacylglycerol in the biosynthetic pathway for the production of
triacylglycerol.

8. A nucleotide sequence according to claim 7 encoding an enzyme
designated as phospholipid:diacylglycerol acyltransferase (PDAT).

9. A nucleotide sequence according to claims 7 or 8, selected from the group
consisting of sequences as set forth in SEQ ID No. 1, 1b, 3, 3b, 4, 4a, 4b,
9b, 10, 10b or 11 or a portion, derivate, allele or homologue thereof.

10. A partial nucleotide sequence corresponding to a full-length nucleotide
sequence according to claims 7 to 9, selected from the group consisting of
sequences as set forth in SEQ ID No. 5, 5b, 6b, 7, 8b, 9, 11b or 12 or a
portion, derivate, allele or homologue thereof.

11. A nucleotide sequence according to claims 7 to 10, comprising a nucleotide
sequence which is at least 40% homologous to a nucleotide sequence
selected from the group consisting of those sequences set forth in SEQ ID
No. 1, 1b, 3, 3b, 4, 4a, 4b, 5, 5b, 6b, 7, 8b, 9, 9b, 10, 10b, 11, 11b or 12.

12. A gene construct comprising a nucleotide sequence according to claims 7
to 11 operably linked to a heterologous nucleic acid.

13. A vector comprising a nucleotide sequence according to claims 7 to 11 or a
gene construct according to claim 12.

14. A vector according to claim 13, which is an expression vector.

15. A vector according to claims 13 or 14, further comprising a selectable
marker gene and/or nucleotide sequences for the replication in a host cell
or the integration into the genome of the host cell.

16. A transgenic cell or organism containing a nucleotide sequence according
to claims 7 to 11 and/or a gene construct according to claim 12 and/or a
vector according to claims 13 to 15.

17. A transgenic cell or organism according to claim 16 which is a eucaryotic
cell or organism.

18. A transgenic cell or organism according to claims 16 or 17 which is a yeast
cell or a plant cell or a plant.

19. A transgenic cell or organism according to claims 16 to 18 having an
altered biosynthetic pathway for the production of triacylglycerol.

20. A transgenic cell or organism according to claims 16 to 19 having an
altered oil content.

21. A transgenic cell or organism according to claims 16 to 20 wherein the
activity of PDAT is altered.

22. A transgenic cell or organism according to claims 16 to 21 wherein the
altered activity of PDAT is characterized by an alteration in gene
expression, catalytic activity and/or regulation of activity of the enzyme.

23. A transgenic cell or organism according to claims 16 to 22 wherein the
altered biosynthetic pathway for the production of triacylglycerol is
characterized by the prevention of accumulation of undesirable fatty acids
in the membrane lipids.

24. A process for the production of triacylglycerol, comprising growing a
transgenic cell or organism according to claims 16 to 23 under conditions
whereby the said nucleotide sequence according to claims 7 to 11 is
expressed and whereby the said transgenic cells comprising an enzyme
according to claims 1 to 6 catalysing the transfer of fatty acids from
phospholipids to diacylglycerol forming triacylglycerol.

25. Triacylglycerols produced by a process according to claim 24.

26. Use of a nucleotide sequence according to claims 7 to 11 and/or an
enzyme according to claims 1 to 6 for the production of triacylglycerol
and/or triacylglycerols with uncommon fatty acids.

27. Use of a nucleotide sequence according to claims 7 to 11 and/or an
enzyme according to claims 1 to 6 for the transformation of any cell or
organism in order to be expressed in this cell or organism and result in an
altered, preferably increased oil content of this cell or organism.

Abstract of the Disclosure

The present invention relates to the isolation, identification and characterization
of nucleotide sequences encoding an enzyme catalysing the transfer of fatty
acids from phospholipids to diacylglycerol in the biosynthetic pathway for the
production of triacylglycerol, to the said enzymes and a process for the
production of triacylglycerols.

Description of Figures

FIG. 1.

Metabolism of 14C-labeled PC into the neutral lipid fraction by plant
microsomes. (A) Microsomes from developing seeds of sunflower, R.
communis and C. palaestina were incubated for 80 min at 30°C with PC (8
nmol) having oleic acid in its sn-1 position, and either 14C-labeled oleic,
ricinoleic or vernolic acid in its sn-2 position. Radioactivity incorporated in TAG
(open bars), DAG (solid bars), and unesterified fatty acids (hatched bars) was
quantified using thin layer chromatography followed by electronic
autoradiography, and is shown as percentage of added labeled substrate. (B)
Synthesis in vitro of TAG carrying two vernoloyl and one [14C]ricinoleoyl group
by microsomes from R. communis. The substrates added were unlabeled
divernoloyl-DAG (5 nmol), together with either sn-1-oleoyl-sn-2-[14C]ricinoleoyl-
DAG (0.4 nmol, 7700 dpm/nmol) or sn-1-oleoyl-sn-2-[14C]ricinoleoyl-PC (0.4
nmol, 7700 dpm/nmol). The microsomes were incubated with the substrates for
30 min at 30°C, after which samples were removed for lipid analysis as
described in the section "General methods". The data shown are the average of
two experiments.
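
Because the specific activity of each labeled substrate is stated, radioactivity recovered in a product can be converted into a molar amount and into a percentage of the added label. A minimal sketch, with hypothetical function names, assuming that counts are already expressed in dpm and that the labeled acyl group keeps the specific activity of the substrate:

```python
def dpm_to_nmol(dpm, specific_activity_dpm_per_nmol):
    """Convert measured radioactivity (dpm) into nmol of labeled acyl group."""
    if specific_activity_dpm_per_nmol <= 0:
        raise ValueError("specific activity must be positive")
    return dpm / specific_activity_dpm_per_nmol


def percent_of_added_label(product_dpm, substrate_nmol,
                           specific_activity_dpm_per_nmol):
    """Express product radioactivity as a percentage of the label added."""
    added_dpm = substrate_nmol * specific_activity_dpm_per_nmol
    if added_dpm <= 0:
        raise ValueError("added label must be positive")
    return 100.0 * product_dpm / added_dpm
```

With 0.4 nmol of substrate at 7700 dpm/nmol, the added label amounts to 3080 dpm, which is the denominator for the percentages in panel (A).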

FIG. 2.

PDAT activity in yeast microsomes, as visualized by autoradiogram of neutral
lipid products separated on TLC. Microsomal membranes (10 nmol of PC) from
the wild type yeast strain FY1679 (lanes 1-3), a congenic yeast strain
(FVKT004-04C(AL)) that is disrupted for YNR008w (lane 4) or the same
disruption strain transformed with the plasmid pUS1, containing the YNR008w
gene behind its native promoter (lane 5), were assayed for PDAT activity. As
substrates, we used 2 nmol sn-1-oleoyl-sn-2-[14C]ricinoleoyl-PC together with
either 5 nmol of dioleoyl-DAG (lanes 2, 4 and 5) or rac-oleoyl-vernoleoyl-DAG
(lane 3). The enzymatic assay and lipid analysis were performed as described in
Materials and Methods. The cells were precultured for 20 h in liquid YPD
medium, harvested and re-suspended in an equal volume of minimal medium
(19) containing 16 g/l glycerol. The cells were then grown for an additional 24 h
prior to being harvested. Selection for the plasmid was maintained by growing
the transformed cells in synthetic medium lacking uracil (18). Abbreviations:
1-OH-TAG, monoricinoleoyl-TAG; 1-OH-1-ep-TAG, monoricinoleoyl-
monovernoloyl-TAG; OH-FA, unesterified ricinoleic acid.

FIG. 3.

Lipid content (A, B) and PDAT activity (C) in PDAT overexpressing yeast cells.
The PDAT gene in the plasmid pUS4 was overexpressed from the galactose-
induced GAL1-TPK2 promoter in the wild type strain W303-1A (7). Its
expression was induced after (A) 2 hours or (B) 25 hours of growth by the
addition of 2% final concentration (w/v) of galactose. The cells were then
incubated for another 22 hours before being harvested. The amount of lipids of
the harvested cells was determined by GLC analysis of its fatty acid contents
and is presented as µmol fatty acids per mg dry weight in TAG (open
bar), polar lipids (hatched bar), sterol esters (solid bar) and other lipids (striped
bar). The data shown are the mean values of results with three independent
yeast cultures. (C) In vitro synthesis of TAG by microsomes prepared from
yeast cells containing either the empty vector (vector) or the PDAT plasmid (+
PDAT). The cells were grown as in Fig. 3A. The substrate lipids dioleoyl-DAG
(2.5 nmol) and sn-1-oleoyl-sn-2-[14C]oleoyl-PC (2 nmol) were added to aliquots
of microsomes (10 nmol PC), which were then incubated for 10 min at 28 °C.
The amount of label incorporated into TAG was quantified by electronic
autoradiography. The results shown are the mean values of two experiments.

FIG. 4.

Substrate specificity of yeast PDAT. The PDAT activity was assayed by
incubating aliquots of lyophilized microsomes (10 nmol PC) with substrate lipids
at 30°C for 10 min (panel A) or 90 min (panel B). Unlabeled DAG (2.5 nmol)
was used as substrate together with different labeled phospholipids, as shown
in the figure. (A) Sn-position specificity of yeast PDAT regarding the acyl donor
substrate. Dioleoyl-DAG together with either sn-1-[14C]oleoyl-sn-2-[14C]oleoyl-
PC (di-[14C]-PC), sn-1-[14C]oleoyl-sn-2-oleoyl-PC (sn1-[14C]-PC) or sn-1-oleoyl-
sn-2-[14C]oleoyl-PC (sn2-[14C]-PC). (B) Specificity of yeast PDAT regarding
phospholipid headgroup and the acyl composition of the phospholipid as well
as of the diacylglycerol. Dioleoyl-DAG together with either sn-1-oleoyl-sn-2-
[14C]oleoyl-PC (oleoyl-PC), sn-1-oleoyl-sn-2-[14C]oleoyl-PE (oleoyl-PE), sn-1-
oleoyl-sn-2-[14C]ricinoleoyl-PC (ricinoleoyl-PC) or sn-1-oleoyl-sn-2-
[14C]vernoloyl-PC (vernoloyl-PC). In the experiments presented in the 2 bars to
the far right, monoricinoleoyl-DAG (ricinoleoyl-DAG) or mono-vernoloyl-DAG
(vernoloyl-DAG) were used together with sn-1-oleoyl-sn-2-[14C]oleoyl-PC. The
label that was incorporated into TAG (solid bars) and lyso-PC (LPC, open bars)
was quantified by electronic autoradiography. The results shown are the mean
values of two experiments. The microsomes used were from W303-1A cells
overexpressing the PDAT gene from the GAL1-TPK2 promoter, as described in
Fig. 3. The expression was induced at early stationary phase and the cells were
harvested after an additional 24 h.



TAB.1:

In vitro synthesis of triacylglycerols in microsomal preparations of developing
castor bean. Aliquots of microsomes (20 nmol PC) were lyophilised and
substrate lipids were added in benzene solution: (A) 0.4 nmol [14C]-DAG (7760
dpm/nmol) and where indicated 1.6 nmol unlabelled DAG; (B) 0.4 nmol [14C]-
DAG (7760 dpm/nmol) and 5 nmol unlabelled di-ricinoleoyl-PC and (C) 0.25
nmol [14C]-PC (4000 dpm/nmol) and 5 nmol unlabelled DAG. The benzene was
evaporated by N2 and 0.1 ml of 50 mM potassium phosphate was added,
thoroughly mixed and incubated at 30 °C for (A) 20 min; (B) and (C) 30 min.
Assays were terminated by extraction of the lipids in chloroform. The lipids
were then separated by thin layer chromatography on silica gel 60 plates
(Merck; Darmstadt, Germany) in hexane/diethylether/acetic acid (35:70:1.5). The
radioactive lipids were visualised and the radioactivity quantified on the plate by
electronic autoradiography (Instant Imager, Packard, US). Results are
presented as mean values of two experiments.

Radioactivity in different triacylglycerol (TAG) species formed. Abbreviations
used: 1-OH-, mono-ricinoleoyl-; 2-OH-, di-ricinoleoyl-; 3-OH-, tri-ricinoleoyl-;
1-OH-1-ver-, mono-ricinoleoyl-monovernoleoyl-; 1-OH-2-ver-, mono-ricinoleoyl-
divernoleoyl-. Radiolabelled DAG and PC were prepared enzymatically. The
radiolabelled ricinoleoyl group is attached at the sn-2 position of the lipid and
the unlabelled oleoyl group at the sn-1 position. Unlabelled DAG with vernoleoyl or
ricinoleoyl chains were prepared by the action of TAG lipase (6) on oil of
Euphorbia lagascae or castor bean, respectively. Synthetic di-ricinoleoyl-PC
was kindly provided by Metapontum Agribios (Italy).

TAB.2:

Total fatty acids per mg of T2 seeds pooled from individual Arabidopsis thaliana
plants transformed with yeast PDAT gene under the control of napin promoter
(26-14) or transformed with empty vector (32-4).

* = statistical difference between control plants and PDAT transformed plants
in a mean difference two-sided test at a = 5.

Description of the SEQ ID:

SEQ ID NO. 1: Genomic DNA sequence and suggested amino acid sequence of
the Saccharomyces cerevisiae PDAT gene, YNR008w, with GenBank accession
numbers Z71623 and Y13139, and with nucleotide ID number 1302481.

SEQ ID NO. 2: The amino acid sequence of the suggested open reading frame
YNR008w from Saccharomyces cerevisiae.

SEQ ID NO. 3: Genomic DNA sequence of the Schizosaccharomyces pombe gene
SPBC776.14.

SEQ ID NO. 4: Genomic DNA sequence of part of the Arabidopsis thaliana locus
with GenBank accession number AB006704.

SEQ ID NO. 5: Nucleotide sequence of the Arabidopsis thaliana cDNA clone with
GenBank accession number T04806, and nucleotide ID number 315966.

SEQ ID NO. 6: Predicted amino acid sequence of the Arabidopsis thaliana cDNA
clone with GenBank accession number T04806.

SEQ ID NO. 7: Nucleotide and amino acid sequence of the Zea mays EST clone
with GenBank accession number AI491339, and nucleotide ID number g4388167.

SEQ ID NO. 8: Predicted amino acid sequence of the Zea mays EST clone with
GenBank accession number AI491339, and nucleotide ID number g4388167.

SEQ ID NO. 9: DNA sequence of part of the Neurospora crassa EST clone
W07G1, with GenBank accession number AI398644, and nucleotide ID number
g4241729.

SEQ ID NO. 10: Genomic DNA sequence of part of the Arabidopsis thaliana locus
with GenBank accession number AC004557.

SEQ ID NO. 11: Genomic DNA sequence of part of the Arabidopsis thaliana locus
with GenBank accession number AC003027.

SEQ ID NO. 12: DNA sequence of part of the Lycopersicon esculentum cDNA
clone with GenBank accession number AI486635.

SEQ ID NO. 13: Amino acid sequence of the Schizosaccharomyces pombe
putative open reading frame CAA22887 of the Schizosaccharomyces pombe gene
SPBC776.14.

SEQ ID NO. 14: Amino acid sequence of the Arabidopsis thaliana putative open
reading frame AAG80628 derived from the Arabidopsis thaliana locus with
GenBank accession number AC004557.

SEQ ID NO. 15: Amino acid sequence of the Arabidopsis thaliana putative open
reading frame AAD10668 derived from the Arabidopsis thaliana locus with
GenBank accession number AC003027.

Further provisional and/or partial sequences are defined through the following 
SEQ IDs: 

SEQ ID NO. 1a: The amino acid sequence of the yeast ORF YNR008w from 
Saccharomyces cerevisiae. 

SEQ ID NO. 2a: Amino acid sequence of the region of the Arabidopsis thaliana 
genomic sequence (AC004557). 

SEQ ID NO. 3a: Amino acid sequence of the region of the Arabidopsis thaliana 
genomic sequence (AB006704). 

SEQ ID NO. 4a: The corresponding genomic DNA sequence and amino acid 
sequence of the yeast ORF YNR008w from Saccharomyces cerevisiae. 

SEQ ID NO. 5a: The amino acid sequence of the yeast ORF YNR008w from 
Saccharomyces cerevisiae derived from the corresponding genomic DNA 
sequence. 

SEQ ID NO. 1b: Genomic DNA sequence of the Saccharomyces cerevisiae 
PDAT gene, YNR008w, GenBank nucleotide ID number 1302481, and the 
suggested YNR008w amino acid sequence. 

SEQ ID NO. 2b: The suggested amino acid sequence of the yeast gene 
YNR008w from Saccharomyces cerevisiae. 

SEQ ID NO. 3b: Genomic DNA sequence of the Schizosaccharomyces pombe 
gene SPBC776.14. 

SEQ ID NO. 4b: Genomic DNA sequence of part of the Arabidopsis thaliana 
locus with GenBank accession number AB006704. 

SEQ ID NO. 5b: Nucleotide sequence and the corresponding amino acid 
sequence of the Arabidopsis thaliana EST clone with GenBank accession 
number T04806, and ID number 315966. 

SEQ ID NO. 6b: Nucleotide and amino acid sequence of the Zea mays cDNA 
clone with GenBank ID number g4388167. 

SEQ ID NO. 7b: Amino acid sequence of the Zea mays cDNA clone with 
GenBank ID number g4388167. 

SEQ ID NO. 8b: DNA sequence of part of the Neurospora crassa cDNA clone 
W07G1, ID number g4241729. 

SEQ ID NO. 9b: Genomic DNA sequence of part of the Arabidopsis thaliana 
locus with GenBank accession number AC004557. 

SEQ ID NO. 10b: Genomic DNA sequence of part of the Arabidopsis thaliana 
locus with GenBank accession number AC003027. 



SEQ ID NO. 11b: DNA sequence of part of the Lycopersicon esculentum cDNA 
clone with GenBank accession number AI486635. 
Fig. 1: 
Tab. 2: 

T1 plant   T2 plant number   nmol fatty acids per mg seed   standard deviation 

32-4       1                 1277                           ±11 (n=2) 
           4                 1261                           [illegible] (n=3) 
           5                 1369                           ±17 (n=3) 
           6                 1312                           ±53 (n=4) 
           7                 1197                           ±54 (n=5) 
           8                 1240                           ±78 (n=4) 
           9                 1283                           ±54 (n=5) 
           10                1381                           ±35 (n=5) 

26-14      1                 1444                           ±110 (n=4) 
           [illegible]       [illegible]                    ±109 (n=4) 
           [illegible]       1374                           ±37 (n=2) 
           5                 1562*                          ±70 (n=4) 
           [illegible]       1393                           ±77 (n=4) 
           7                 1433                           ±98 (n=4) 
           8                 1581*                          ±82 (n=4) 
Sequence Listing 

<210> 1 

<211> 1986 

<212> genomic DNA 

<213> Saccharomyces cerevisiae 

<221> CDS 

<222> (1)..(1983) 

<400> 1 

atg ggc aca ctg ttt cga aga aat gtc cag aac caa aag agt gat tct 48 
Met Gly Thr Leu Phe Arg Arg Asn Val Gln Asn Gln Lys Ser Asp Ser 
1 5 10 15 

gat gaa aac aat aaa ggg ggt tct gtt cat aac aag cga gag agc aga 96 
Asp Glu Asn Asn Lys Gly Gly Ser Val His Asn Lys Arg Glu Ser Arg 
20 25 30 

aac cac att cat cat caa cag gga tta ggc cat aag aga aga agg ggt 144 
Asn His Ile His His Gln Gln Gly Leu Gly His Lys Arg Arg Arg Gly 
35 40 45 

att agt ggc agt gca aaa aga aat gag cgt ggc aaa gat ttc gac agg 192 
Ile Ser Gly Ser Ala Lys Arg Asn Glu Arg Gly Lys Asp Phe Asp Arg 
50 55 60 

aaa aga gac ggg aac ggt aga aaa cgt tgg aga gat tcc aga aga ctg 240 
Lys Arg Asp Gly Asn Gly Arg Lys Arg Trp Arg Asp Ser Arg Arg Leu 
65 70 75 80 

att ttc att ctt ggt gca ttc tta ggt gta ctt ttg ccg ttt agc ttt 288 
Ile Phe Ile Leu Gly Ala Phe Leu Gly Val Leu Leu Pro Phe Ser Phe 
85 90 95 

ggc gct tat cat gtt cat aat agc gat agc gac ttg ttt gac aac ttt 336 
Gly Ala Tyr His Val His Asn Ser Asp Ser Asp Leu Phe Asp Asn Phe 
100 105 110 

gta aat ttt gat tca ctt aaa gtg tat ttg gat gat tgg aaa gat gtt 384 
Val Asn Phe Asp Ser Leu Lys Val Tyr Leu Asp Asp Trp Lys Asp Val 
115 120 125 

ctc cca caa ggt ata agt tcg ttt att gat gat att cag gct ggt aac 432 
Leu Pro Gln Gly Ile Ser Ser Phe Ile Asp Asp Ile Gln Ala Gly Asn 
130 135 140 

tac tcc aca tct tct tta gat gat ctc agt gaa aat ttt gcc gtt ggt 480 
Tyr Ser Thr Ser Ser Leu Asp Asp Leu Ser Glu Asn Phe Ala Val Gly 
145 150 155 160 

aaa caa ctc tta cgt gat tat aat atc gag gcc aaa cat cct gtt gta 528 
Lys Gln Leu Leu Arg Asp Tyr Asn Ile Glu Ala Lys His Pro Val Val 
165 170 175 

atg gtt cct ggt gtc att tct acg gga att gaa agc tgg gga gtt att 576 
Met Val Pro Gly Val Ile Ser Thr Gly Ile Glu Ser Trp Gly Val Ile 
180 185 190 

gga gac gat gag tgc gat agt tct gcg cat ttt cgt aaa cgg ctg tgg 624 
Gly Asp Asp Glu Cys Asp Ser Ser Ala His Phe Arg Lys Arg Leu Trp 
195 200 205 

gga agt ttt tac atg ctg aga aca atg gtt atg gat aaa gtt tgt tgg 672 
Gly Ser Phe Tyr Met Leu Arg Thr Met Val Met Asp Lys Val Cys Trp 
210 215 220 

ttg aaa cat gta atg tta gat cct gaa aca ggt ctg gac cca ccg aac 720 
Leu Lys His Val Met Leu Asp Pro Glu Thr Gly Leu Asp Pro Pro Asn 
225 230 235 240 

ttt acg cta cgt gca gca cag ggc ttc gaa tca act gat tat ttc atc 768 
Phe Thr Leu Arg Ala Ala Gln Gly Phe Glu Ser Thr Asp Tyr Phe Ile 
245 250 255 

gca ggg tat tgg att tgg aac aaa gtt ttc caa aat ctg gga gta att 816 
Ala Gly Tyr Trp Ile Trp Asn Lys Val Phe Gln Asn Leu Gly Val Ile 
260 265 270 

ggc tat gaa ccc aat aaa atg acg agt gct gcg tat gat tgg agg ctt 864 
Gly Tyr Glu Pro Asn Lys Met Thr Ser Ala Ala Tyr Asp Trp Arg Leu 
275 280 285 

gca tat tta gat cta gaa aga cgc gat agg tac ttt acg aag cta aag 912 
Ala Tyr Leu Asp Leu Glu Arg Arg Asp Arg Tyr Phe Thr Lys Leu Lys 
290 295 300 

gaa caa atc gaa ctg ttt cat caa ttg agt ggt gaa aaa gtt tgt tta 960 
Glu Gln Ile Glu Leu Phe His Gln Leu Ser Gly Glu Lys Val Cys Leu 
305 310 315 320 

att gga cat tct atg ggt tct cag att atc ttt tac ttt atg aaa tgg 1008 
Ile Gly His Ser Met Gly Ser Gln Ile Ile Phe Tyr Phe Met Lys Trp 
325 330 335 

gtc gag gct gaa ggc cct ctt tac ggt aat ggt ggt cgt ggc tgg gtt 1056 
Val Glu Ala Glu Gly Pro Leu Tyr Gly Asn Gly Gly Arg Gly Trp Val 
340 345 350 

aac gaa cac ata gat tca ttc att aat gca gca ggg acg ctt ctg ggc 1104 
Asn Glu His Ile Asp Ser Phe Ile Asn Ala Ala Gly Thr Leu Leu Gly 
355 360 365 

gct cca aag gca gtt cca gct cta att agt ggt gaa atg aaa gat acc 1152 
Ala Pro Lys Ala Val Pro Ala Leu Ile Ser Gly Glu Met Lys Asp Thr 
370 375 380 

att caa tta aat acg tta gcc atg tat ggt ttg gaa aag ttc ttc tca 1200 
Ile Gln Leu Asn Thr Leu Ala Met Tyr Gly Leu Glu Lys Phe Phe Ser 
385 390 395 400 

aga att gag aga gta aaa atg tta caa acg tgg ggt ggt ata cca tca 1248 
Arg Ile Glu Arg Val Lys Met Leu Gln Thr Trp Gly Gly Ile Pro Ser 
405 410 415 

atg cta cca aag gga gaa gag gtc att tgg ggg gat atg aag tca tct 1296 
Met Leu Pro Lys Gly Glu Glu Val Ile Trp Gly Asp Met Lys Ser Ser 
420 425 430 

tca gag gat gca ttg aat aac aac act gac aca tac ggc aat ttc att 1344 
Ser Glu Asp Ala Leu Asn Asn Asn Thr Asp Thr Tyr Gly Asn Phe Ile 
435 440 445 

cga ttt gaa agg aat acg agc gat gct ttc aac aaa aat ttg aca atg 1392 
Arg Phe Glu Arg Asn Thr Ser Asp Ala Phe Asn Lys Asn Leu Thr Met 
450 455 460 

aaa gac gcc att aac atg aca tta tcg ata tca cct gaa tgg ctc caa 1440 
Lys Asp Ala Ile Asn Met Thr Leu Ser Ile Ser Pro Glu Trp Leu Gln 
465 470 475 480 

aga aga gta cat gag cag tac tcg ttc ggc tat tcc aag aat gaa gaa 1488 
Arg Arg Val His Glu Gln Tyr Ser Phe Gly Tyr Ser Lys Asn Glu Glu 
485 490 495 

gag tta aga aaa aat gag cta cac cac aag cac tgg tcg aat cca atg 1536 
Glu Leu Arg Lys Asn Glu Leu His His Lys His Trp Ser Asn Pro Met 
500 505 510 

gaa gta cca ctt cca gaa gct ccc cac atg aaa atc tat tgt ata tac 1584 
Glu Val Pro Leu Pro Glu Ala Pro His Met Lys Ile Tyr Cys Ile Tyr 
515 520 525 

g'tg aac aac cca act gaa agg gca tat gta tat aag gaa gag gat 1632 
Gly Val Asn Asn Pro Thr Glu Arg Ala Tyr Val Tyr Lys Glu Glu Asp 
530 535 540 

gac tcc tct gct ctg aat ttg acc atc gac tac gaa agc aag caa cct 1680 
Asp Ser Ser Ala Leu Asn Leu Thr Ile Asp Tyr Glu Ser Lys Gln Pro 
545 550 555 560 

gta ttc ctc acc gag ggg gac gga acc gtt ccg ctc gtg gcg cat tca 1728 
Val Phe Leu Thr Glu Gly Asp Gly Thr Val Pro Leu Val Ala His Ser 
565 570 575 

atg tgt cac aaa tgg gcc cag ggt gct tca ccg tac aac cct gcc gga 1776 
Met Cys His Lys Trp Ala Gln Gly Ala Ser Pro Tyr Asn Pro Ala Gly 
580 585 590 

att aac gtt act att gtg gaa atg aaa cac cag cca gat cga ttt gat 1824 
Ile Asn Val Thr Ile Val Glu Met Lys His Gln Pro Asp Arg Phe Asp 
595 600 605 

ata cgt ggt gga gca aaa agc gcc gaa cac gta gac atc ctc ggc agc 1872 
Ile Arg Gly Gly Ala Lys Ser Ala Glu His Val Asp Ile Leu Gly Ser 
610 615 620 

gcg gag ttg aac gat tac atc ttg aaa att gca agc ggt aat ggc gat 1920 
Ala Glu Leu Asn Asp Tyr Ile Leu Lys Ile Ala Ser Gly Asn Gly Asp 
625 630 635 640 

ctc gtc gag cca cgc caa ttg tct aat ttg agc cag tgg gtt tct cag 1968 
Leu Val Glu Pro Arg Gln Leu Ser Asn Leu Ser Gln Trp Val Ser Gln 
645 650 655 

atg ccc ttc cca atg taa 1986 
Met Pro Phe Pro Met 
660 
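
As a consistency check on the listing above, the annotated coding region (1)..(1983) spans 1983 / 3 = 661 codons, which agrees with the length of 661 given for SEQ ID NO. 2. The following is a minimal sketch, not part of the original filing, of how such a check could be scripted. It assumes the standard genetic code; the names CODON_TABLE and translate are illustrative only, and the input is limited to the first and last rows of SEQ ID NO. 1 as printed.

```python
# Illustrative sketch (assumption: standard genetic code, NCBI translation table 1).
# CODON_TABLE and translate are hypothetical helper names, not taken from the filing.

BASES = "tcag"
# Amino acids for all 64 codons in the conventional t, c, a, g ordering.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string (whitespace ignored) into one-letter amino acids."""
    dna = "".join(dna.lower().split())
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First row of SEQ ID NO. 1 (positions 1 to 48) as printed in the listing.
first_row = "atg ggc aca ctg ttt cga aga aat gtc cag aac caa aag agt gat tct"
# Last row of SEQ ID NO. 1 (positions 1969 to 1986), ending in the stop codon.
last_row = "atg ccc ttc cca atg taa"

print(translate(first_row))  # one-letter form of Met Gly Thr Leu Phe Arg ...
print(translate(last_row))   # '*' marks the stop codon
print(1983 // 3)             # number of codons in the annotated CDS
```

The one-letter output corresponds to the three-letter residues printed under each codon row (M = Met, G = Gly, T = Thr, and so on), so a mismatch between a codon row and its translation row would point to a transcription error in one of them.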
<210> 2 
<211> 661 
<212> PRT 

<213> Saccharomyces cerevisiae 
<400> 2 

Met Gly Thr Leu Phe Arg Arg Asn Val Gln Asn Gln Lys Ser Asp Ser 
1 5 10 15 

Asp Glu Asn Asn Lys Gly Gly Ser Val His Asn Lys Arg Glu Ser Arg 
20 25 30 

Asn His Ile His His Gln Gln Gly Leu Gly His Lys Arg Arg Arg Gly 
35 40 45 

Ile Ser Gly Ser Ala Lys Arg Asn Glu Arg Gly Lys Asp Phe Asp Arg 
50 55 60 

Lys Arg Asp Gly Asn Gly Arg Lys Arg Trp Arg Asp Ser Arg Arg Leu 
65 70 75 80 

Ile Phe Ile Leu Gly Ala Phe Leu Gly Val Leu Leu Pro Phe Ser Phe 
85 90 95 

Gly Ala Tyr His Val His Asn Ser Asp Ser Asp Leu Phe Asp Asn Phe 
100 105 110 

Val Asn Phe Asp Ser Leu Lys Val Tyr Leu Asp Asp Trp Lys Asp Val 
115 120 125 

Leu Pro Gln Gly Ile Ser Ser Phe Ile Asp Asp Ile Gln Ala Gly Asn 
130 135 140 

Tyr Ser Thr Ser Ser Leu Asp Asp Leu Ser Glu Asn Phe Ala Val Gly 
145 150 155 160 

Lys Gln Leu Leu Arg Asp Tyr Asn Ile Glu Ala Lys His Pro Val Val 
165 170 175 

Met Val Pro Gly Val Ile Ser Thr Gly Ile Glu Ser Trp Gly Val Ile 
180 185 190 

Gly Asp Asp Glu Cys Asp Ser Ser Ala His Phe Arg Lys Arg Leu Trp 
195 200 205 

Gly Ser Phe Tyr Met Leu Arg Thr Met Val Met Asp Lys Val Cys Trp 
210 215 220 

Leu Lys His Val Met Leu Asp Pro Glu Thr Gly Leu Asp Pro Pro Asn 
225 230 235 240 

Phe Thr Leu Arg Ala Ala Gln Gly Phe Glu Ser Thr Asp Tyr Phe Ile 
245 250 255 

Ala Gly Tyr Trp Ile Trp Asn Lys Val Phe Gln Asn Leu Gly Val Ile 
260 265 270 

Gly Tyr Glu Pro Asn Lys Met Thr Ser Ala Ala Tyr Asp Trp Arg Leu 
275 280 285 

Ala Tyr Leu Asp Leu Glu Arg Arg Asp Arg Tyr Phe Thr Lys Leu Lys 
290 295 300 

Glu Gln Ile Glu Leu Phe His Gln Leu Ser Gly Glu Lys Val Cys Leu 
305 310 315 320 

Ile Gly His Ser Met Gly Ser Gln Ile Ile Phe Tyr Phe Met Lys Trp 
325 330 335 
325 330 335 
Val Glu Ala Glu Gly Pro Leu Tyr Gly Asn Gly Gly Arg Gly Trp Val 
340 345 350 

Asn Glu His Ile Asp Ser Phe Ile Asn Ala Ala Gly Thr Leu Leu Gly 
355 360 365 

Ala Pro Lys Ala Val Pro Ala Leu Ile Ser Gly Glu Met Lys Asp Thr 
370 375 380 

Ile Gln Leu Asn Thr Leu Ala Met Tyr Gly Leu Glu Lys Phe Phe Ser 
385 390 395 400 

Arg Ile Glu Arg Val Lys Met Leu Gln Thr Trp Gly Gly Ile Pro Ser 
405 410 415 

Met Leu Pro Lys Gly Glu Glu Val Ile Trp Gly Asp Met Lys Ser Ser 
420 425 430 

Ser Glu Asp Ala Leu Asn Asn Asn Thr Asp Thr Tyr Gly Asn Phe Ile 
435 440 445 

Arg Phe Glu Arg Asn Thr Ser Asp Ala Phe Asn Lys Asn Leu Thr Met 
450 455 460 

Lys Asp Ala Ile Asn Met Thr Leu Ser Ile Ser Pro Glu Trp Leu Gln 
465 470 475 480 

Arg Arg Val His Glu Gln Tyr Ser Phe Gly Tyr Ser Lys Asn Glu Glu 
485 490 495 

Glu Leu Arg Lys Asn Glu Leu His His Lys His Trp Ser Asn Pro Met 
500 505 510 

Glu Val Pro Leu Pro Glu Ala Pro His Met Lys Ile Tyr Cys Ile Tyr 
515 520 525 

Gly Val Asn Asn Pro Thr Glu Arg Ala Tyr Val Tyr Lys Glu Glu Asp 
530 535 540 

Asp Ser Ser Ala Leu Asn Leu Thr Ile Asp Tyr Glu Ser Lys Gln Pro 
545 550 555 560 

Val Phe Leu Thr Glu Gly Asp Gly Thr Val Pro Leu Val Ala His Ser 
565 570 575 

Met Cys His Lys Trp Ala Gln Gly Ala Ser Pro Tyr Asn Pro Ala Gly 
580 585 590 

Ile Asn Val Thr Ile Val Glu Met Lys His Gln Pro Asp Arg Phe Asp 
595 600 605 

Ile Arg Gly Gly Ala Lys Ser Ala Glu His Val Asp Ile Leu Gly Ser 
610 615 620 

Ala Glu Leu Asn Asp Tyr Ile Leu Lys Ile Ala Ser Gly Asn Gly Asp 
625 630 635 640 

Leu Val Glu Pro Arg Gln Leu Ser Asn Leu Ser Gln Trp Val Ser Gln 
645 650 655 

Met Pro Phe Pro Met 
660 
<210> 3 

<211> 2312 

<212> genomic DNA 

<213> Schizosaccharomyces pombe 



<400> 3 

ATGGCGTCTT CCAAGAAGAG CAAAACTCAT AAGAAAAAGA AAGAAGTCAA 50 

ATCTCCTATC GACTTACCAA ATTCAAAGAA ACCAACTCGC GCTTTGAGTG 100 

AGCAACCTTC AGCGTCCGAA ACACAATCTG TTTCAAATAA ATCAAGAAAA 150 

TCTAAATTTG GAAAAAGATT GAATTTTATA TTGGGCGCTA TTTTGGGAAT 200 

ATGCGGTGCT TTTTTTTTCG CTGTTGGAGA CGACAATGCT GTTTTCGACC 250 

CTGCTACGTT AGATAAATTT GGGAATATGC TAGGCTCTTC AGACTTGTTT 300 

GATGACATTA AAGGATATTT ATCTTATAAT GTGTTTAAGG ATGCACCTTT 350 

TACTACGGAC AAGCCTTCGC AGTCTCCTAG CGGAAATGAA GTTCAAGTTG 400 

GTCTTGATAT GTACAATGAG GGATATCGAA GTGACCATCC TGTTATTATG 450 

GTTCCTGGTG TTATCAGCTC AGGATTAGAA AGTTGGTCGT TTAATAATTG 500 

CTCGATTCCT TACTTTAGGA AACGTCTTTG GGGTAGCTGG TCTATGCTGA 550 

AGGCAATGTT CCTTGACAAG CAATGCTGGC TTGAACATTT AATGCTTGAT 600 

AAAAAAACCG GCTTGGATCC GAAGGGAATT AAGCTGCGAG CAGCTCAGGG 650 

GTTTGAAGCA GCTGATTTTT TTATCACGGG CTATTGGATT TGGAGTAAAG 700 

TAATTGAAAA CCTTGCTGCA ATTGGTTATG AGCCTAATAA CATGTTAAGT 750 

GCTTCTTACG ATTGGCGGTT ATCATATGCA AATTTAGAGG AACGTGATAA 800 

ATATTTTTGA AAGTTAAAAA TGTTCATTGA GTACAGCAAC ATTGTACATA 850 

AGAAAAAGGT AGTGTTGATT TCTCACTCCA TGGGTTCACA GGTTACGTAC 900 

TATTTTTTTA AGTGGGTTGA AGCTGAGGGC TACGGAAATG GTGGACCGAC 950 

TTGGGTTAAT GATCATATTG AAGCATTTAT AAATGTGAGT CTCGATGGTT 1000 
GTTTGACTAC GTTTCTAACT TTTGAATAGA TATCGGGATC TTTGATTGGA 1050 

GCACCCAAAA CAGTGGCAGC GCTTTTATCG GGTGAAATGA AAGATACAGG 1100 

TATTGTAATT ACATTAAACA TGTTAATATT TAATTTTTGC TAACCGTTTT 1150 

AAGCTCAATT GAATCAGTTT TCGGTCTATG GGTAAGCAAT AAATTGTTGA 1200 

GATTTGTTAC TAATTTACTG TTTAGTTTGG AAAAATTTTT TTCCCGTTCT 1250 

GAGGTATATT CAAAAATACA AATGTGCTCT ACTTTTTCTA ACTTTTAATA 1300 

GAGAGCCATG ATGGTTCGCA CTATGGGAGG AGTTAGTTCT ATGCTTCCTA 1350 

AAGGAGGCGA TGTTGTATGG GGAAATGCCA GTTGGGTAAG AAATATGTGC 1400 

TGTTAATTTT TTATTAATAT TTAGGCTCCA GATGATCTTA ATCAAACAAA 1450 

TTTTTCCAAT GGTGCAATTA TTCGATATAG AGAAGACATT GATAAGGACC 1500 

ACGATGAATT TGACATAGAT GATGCATTAC AATTTTTAAA AAATGTTACA 1550 

GATGACGATT TTAAAGTCAT GCTAGCGAAA AATTATTCCC ACGGTCTTGC 1600 
TTGGACTGAA AAAGAAGTGT TAAAAAATAA CGAAATGCCG TCTAAATGGA 1650 

TAAATCCGCT AGAAGTAAGA ACATTAAAGT TACTAAATTA TACTAACCCA 1700 

AATAGACTAG TCTTCCTTAT GCTCCTGATA TGAAAATTTA TTGCGTTCAC 1750 

GGGGTCGGAA AACCAACTGA GAGAGGTTAT TATTATACTA ATAATCCTGA 1800 

GGGGCAACCT GTCATTGATT CCTCGGTTAA TGATGGAACA AAAGTTGAAA 1850 

ATGTGAGAGA ATTTATGTTT CAAACATTCT ATTAACTGTT TTATTAGGGT 1900 

ATTGTTATGG ATGATGGTGA TGGAACTTTA CCAATATTAG CCCTTGGTTT 1950 

GGTGTGCAAT AAAGTTTGGC AAACAAAAAG GTTTAATCCT GCTAATACAA 2000 

GTATCACAAA TTATGAAATC AAGCATGAAC CTGCTGCGTT TGATCTGAGA 2050 

GGAGGACCTC GCTCGGGAGA ACACGTCGAT ATACTTGGAC ATTCAGAGCT 2100 

AAATGTATGT TCATTTTACC TTACAAATTT CTATTACTAA CTCTTGAAAT 2150 
AAGGAAATTA TTTTAAAAGT TTCATCAGGC CATGGTGACT CGGTACCAAA 2200 

CCGTTATATA TCAGATATCC AGTACGGACA TAAGTTTTGT AGATTGCAAT 2250 

TAACTAACTA ACCGAACAGG GAAATAATAA ATGAGATAAA TCTCGATAAA 2300 
CCTAGAAATT AA 2312 
<210> 4 

<211> 3685 

<212> genomic DNA 

<213> Arabidopsis thaliana 

<400> 4 

ATGCCCCTTA TTCATCGGAA AAAGCCGACG GAGAAACCAT CGACGCCGCC 50 
ATCTGAAGAG GTGGTGCACG ATGAGGATTC GCAAAAGAAA CCACACGAAT 100 
CTTCCAAATC GCACCATAAG AAATCGAACG GAGGAGGGAA GTGGTCGTGC 150 
ATCGATTCTT GTTGTTGGTT CATTGGGTGT GTGTGTGTAA CCTGGTGGTT 200 
TCTTCTCTTC CTTTACAACG CAATGCCTGC GAGCTTCCCT CAGTATGTAA 250 
CGGAGCGAAT CACGGGTCCT TTGCCTGACC CGCCCGGTGT TAAGCTCAAA 300 
AAAGAAGGTC TTAAGGCGAA ACATCCTGTT GTCTTCATTC CTGGGATTGT 350 
CACCGGTGGG CTCGAGCTTT GGGAAGGCAA ACAATGCGCT GATGGTTTAT 400 
TTAGAAAACG TTTGTGGGGT GGAACTTTTG GTGAAGTCTA CAAAAGGTGA 450 
GCTCAACAAT TCTCACTCTT CCTTTATATT GGGATTTGGA TTGGATCTGA 500 
TGAGATCACG CACTTGTTGC TTCTTCAACA TCACTCAAAC TTTAATTCCA 550 
TGTTTGTCTG TCTTACTCTT TACTTTTTTT TTTTTTTGAT GTGAAACGCT 600 
ATTTTCTTAA GAGACTATTT CTGTATGTGT AAGGTAAGCG TTCCAAGGAC 650 
GTAATTGGCT TGGACTATTT CTGTTTGATT GTTAACTTTA GGATATAAAA 700 
TAGCTGCCTT GGAATTTCAA GTCATCTTAT TGCCAAATCT GTTGCTAGAC 750 
ATGCCCTAGA GTCCGTTCAT AACAAGTTAC TTCCTTTACT GTCGTTGCGT 800 
GTAGATTTAG CTTTGTGTAG CGTATAATGA AGTAGTGTTT TATGTTTTGT 850 
TGGGAATAGA GAAGTTCTAA CTACATCTGT GGAAAGTGTG TTCAGGCTGT 900 
GATAGAGGAC TGTTGCTTTA TTATTCAACT ATGTATATGT GTAATTAAAG 950 
CTAGTTCCTT TTTGATCTTT CAGCTCAATG TGCTTTTCTC AATTTTTTTC 1000 
TCAATTTCAA AGTTTCACAT CGAGTTTATT CACATGTCTT GAATTTCGTC 1050 
CATCCTCGTT CTGTTATCCA GCTTTGAACT CCTCCCGACC CTGCTATGGA 1100 
TATATTAAAA AAAAAGTGTT TTGTGGGTTG CATCTTTGTT ACGATCTGCA 1150 
TCTTCTTCTT TCGGCTCAGT GTTCATGTTT TTGCTATGGT AGAGATGGGC 1200 
AATGTTATTG TTGATGGTAA CAGTGGTATA GTTGATAGTA TCTTAACTAA 1250 
TCAATTATCT CTTTGATTCA GGCCTCTATG TTGGGTGGAA CACATGTCAC 1300 
TTGACAATGA AACTGGGTTG GATCCAGCTG GTATTAGAGT TCGAGCTGTA 1350 
TCAGGACTCG TGGCTGCTGA CTACTTTGCT CCTGGCTACT TTGTCTGGGC 1400 
AGTGCTGATT GCTAACCTTG CACATATTGG ATATGAAGAG AAAAATATGT 1450 
AGATGGCTGC ATATGACTGG CGGCTTTCGT TTCAGAACAC AGAGGTTCTT 1500 
TTCTCATCGT TCTTTCTATT ATTCTGTTCC ATGTTACGTT TCTTTCTTCA 1550 
TTACTTAAGG CTTAAATATG TTTCATGTTG AATTAATAGG TACGTGATCA 1600 
GACTCTTAGC CGTATGAAAA GTAATATAGA GTTGATGGTT TCTACCAACG 1650 
GTGGAAAAAA AGCAGTTATA GTTCCGCATT CCATGGGGGT CTTGTATTTT 1700 
CTACATTTTA TGAAGTGGGT TGAGGCACCA GCTCCTCTGG GTGGCGGGGG 1750 
TGGGCCAGAT TGGTGTGCAA AGTATATTAA GGCGGTGATG AACATTGGTG 1800 
GACCATTTCT TGGTGTTCCA AAAGCTGTTG GAGGGCTTTT CTCTGCTGAA 1850 
GCAAAGGATG TTGCAGTTGC CAGGTATTGA ATATCTGCTT ATACTTTTGA 1900 
TGATCAGAAC CTTGGCTCTG GAACTCAAAG TTATTCTACT AAATATCAAT 1950 
TCTAATAACA TTGCTATATT ATCGCTGCAA CTGACATTGG TTGATTATTT 2000 
TTGCTGCTTA TGTAACTGAA ACTCTCTTGA GATTAGACAA ATGATGAATT 2050 
GATAATTCTT ACGCATTGCT CTGTGATGAC CAGTTTCTTA GCTTGGACGA 2100 
TAACATTTGT CATACTGTCT TTTGGAGGGC ATTGAATTTT GCTATGGAAA 2150 
GCGCTGGAGC TTCCATGCTT GCATTCTTTA CCAATTAGCG TTATTCTGCT 2200 
TCTTTCAATT TTCTTGTATA TGCATCTATG GTCTTTTATT TCTTCTTAAT 2250 
TAAAGACTCG TTGGATTAGT TGGTCTATTA GTCACTTGGT TCCTTAATAT 2300 
AGAACTTTAC TTTCTTCGAA AATTGCAGAG CGATTGCCCC AGGATTGTTA 2350 
GACACCGATA TATTTAGACT TCAGACCTTG CAGCATGTAA TGAGAATGAC 2400 
ACGCACATGG GACTCAACAA TGTCTATGTT ACCGAAGGGA GGTGACACGA 2450 
TATGGGGCGG GCTTGATTGG TCACCGGAGA AAGGCCACAC CTGTTGTGGG 2500 
AAAAAGCAAA AGAACAACGA AACTTGTGGT GAAGCAGGTG AAAACGGAGT 2550 
TTCCAAGAAA AGTCCTGTTA ACTATGGAAG GATGATATCT TTTGGGAAAG 2600 
AAGTAGCAGA GGCTGCGCCA TCTGAGATTA ATAATATTGA TTTTCGAGTA 2650 
AGGACATATA AATCATAATA AACCTTGTAC ATTTTGTGAT TGTATGATGA 2700 
ATATCTGTAC ATTTTATCTG GTGAAGGGTG CTGTCAAAGG TCAGAGTATC 2750 
CCAAATCACA CCTGTCGTGA CGTGTGGACA GAGTACCATG ACATGGGAAT 2800 
TGCTGGGATC AAAGCTATCG CTGAGTATAA GGTCTACACT GCTGGTGAAG 2850 
CTATAGATCT ACTACATTAT GTTGCTCCTA AGATGATGGC GCGTGGTGCC 2900 

GCTCATTTCT CTTATGGAAT TGGTGATGAT TTGGATGACA CCAAGTATCA 2950 

AGATCCCAAA TACTGGTCAA ATCCGTTAGA GACAAAGTAA GTGATTTCTT 3000 

GATTCCAACT GTATCCTTCG TCCTGATGCA TTATCAGTCT TTTTGTTTTC 3050 

GGTCTTGTTG GATATGGTTT TCAGCTCAAA GCTTACAAAG CTGTTTCTGA 3100 

GCCTTTCTCA AAAAGGCTTG CTCAGTAATA TTGAGGTGCT AAAGTTGATA 3150 

CATGTGACTC TTGCTTATAA ATCCTCCGTT TGGTTTGTTC TGCTTTTTCA 3200 

GATTACCGAA TGCTCCTGAG ATGGAAATCT ACTCATTATA CGGAGTGGGG 3250 
ATACCAACGG AACGAGCATA CGTATACAAG CTTAACCAGT CTCCCGACAG 3300 

TTGCATCCCC TTTCAGATAT TCACTTCTGC TCACGAGGAG GACGAAGATA 3350 

GCTGTCTGAA AGCAGGAGTT TACAATGTGG ATGGGGATGA AACAGTACCC 3400 

GTCCTAAGTG CCGGGTACAT GTGTGCAAAA GCGTGGCGTG GCAAGACAAG 3450 

ATTCAACCCT TCCGGAATCA AGACTTATAT AAGAGAATAC AATCACTCTC 3500 

CGCCGGCTAA CCTGTTGGAA GGGCGCGGGA CGCAGAGTGG TGCCCATGTT 3550 

GATATCATGG GAAACTTTGC TTTGATCGAA GATATCATGA GGGTTGCCGC 3600 

CGGAGGTAAC GGGTCTGATA TAGGACATGA CCAGGTCCAC TCTGGCATAT 3650 
TTGAATGGTC GGAGCGTATT GACCTGAAGC TGTGA 3685 
<210> 5 
<211> 2427 
<212> cDNA 

<213> Arabidopsis thaliana 



<400> 5 

AGAAACAGCT CTTTGTCTCT CTCGACTGAT CTAACAATCC CTAATCTGTG 50 
TTCTAAATTC CTGGACGAGA TTTGACAAAG TCCGTATAGC TTAACCTGGT 100 
TTAATTTCAA GTGACAGATA TGCCCCTTAT TCATCGGAAA AAGCCGACGG 150 
AGAAACCATC GACGCCGCCA TCTGAAGAGG TGGTGCACGA TGAGGATTCG 200 
CAAAAGAAAC CACACGAATC TTCCAAATCC CACCATAAGA AATCGAACGG 250 
AGGAGGGAAG TGGTCGTGCA TCGATTCTTG TTGTTGGTTC ATTGGGTGTG 300 
TGTGTGTAAC CTGGTGGTTT CTTCTCTTCC TTTACAACGC AATGCCTGCG 350 
AGCTTCCCTC AGTATGTAAC GGAGCGAATC ACGGGTCCTT TGCCTGACCC 400 
GCCCGGTGTT AAGCTCAAAA AAAGAAGGTC TTAAGGCGAA ACATCCTGTT 450 
GTCTTCATTC CTGGGATTGT CACCGGTGGG CTCGAGCTTT GGGAAGGCAA 500 
ACAATGCGCT GATGGTTTAT TTAGAAAACG TTTGTGGGGT GGAACTTTTG 550 
GTGAAGTCTA CAAAAGGCCT CTATGTTGGG TGGAACACAT GTCACTTGAC 600 
AATGAAACTG GGTTGGATCC AGCTGGTATT AGAGTTCGAG CTGTATCAGG 650 
ACTCGTGGCT GCTGACTACT TTGCTCCTGG CTACTTTGTC TGGGCAGTGC 700 
TGATTGCTAA CCTTGCACAT ATTGGATATG AAGAGAAAAA TATGTACATG 750 
GCTGCATATG ACTGGCGGCT TTCGTTTCAG AACACAGAGG TACGTGATCA 800 
GACTCTTAGC CGTATGAAAA GTAATATAGA GTTGATGGTT TCTACCAACG 850 
GTGGAAAAAA AGCAGTTATA GTTCCGCATT CCATGGGGGT CTTGTATTTT 900 
CTACATTTTA TGAAGTGGGT TGAGGCACCA GCTCCTCTGG GTGGCGGGGG 950 
TGGGCCAGAT TGGTGTGCAA AGTATATTAA GGCGGTGATG AACATTGGTG 1000 
GACCATTTCT TGGTGTTCCA AAAGCTGTTG CAGGGCTTTT CTCTGCTGAA 1050 
GCAAAGGATG TTGCAGTTGC CAGAGCGATT GCCCCAGGAT TCTTAGACAC 1100 
CGATATATTT AGACTTCAGA CCTTGCAGCA TGTAATGAGA ATGACACGCA 1150 
CATGGGACTC AACAATGTCT ATGTTACCGA AGGGAGGTGA CACGATATGG 1200 
GGCGGGCTTG ATTGGTCACC GGAGAAAGGC CACACGTGTT GTGGGAAAAA 1250 
GCAAAAGAAC AACGAAACTT GTGGTGAAGC AGGTGAAAAC GGAGTTTCCA 1300 
AGAAAAGTCC TGTTAACTAT GGAAGGATGA TATCTTTTGG GAAAGAAGTA 1350 
GCAGAGGCTG CGCCATCTGA GATTAATAAT ATTGATTTTC GAGGTGCTGT 1400 
CAAAGGTCAG AGTATCCCAA ATCACACCTG TCGTGACGTG TGGACAGAGT 1450 
ACCATGACAT GGGAATTGCT GGGATCAAAG CTATCGCTGA GTATAAGGTC 1500 
TACACTGCTG GTGAAGCTAT AGATCTACTA CATTATGTTG CTCCTAAGAT 1550 
GATGGCGCGT GGTGCCGCTC ATTTCTCTTA TGGAATTGGT GATGATTTGG 1600 
ATGACACCAA GTATCAAGAT CCCAAATACT GGTCAAATCC GTTAGAGACA 1650 
AAATTACCGA ATGCTCCTGA GATGGAAATC TACTCATTAT ACGGAGTGGG 1700 
GATACGAACG GAACGAGCAT ACGTATACAA GCTTAACCAG TCTCCCGACA 1750 
GTTGCATCCC CTTTCAGATA TTCACTTCTG CTCACGAGGA GGACGAAGAT 1800 
AGCTGTCTGA AAGCAGGAGT TTACAATGTG GATGGGGATG AAACAGTACC 1850 
CGTCCTAAGT GCCGGGTACA TGTGTGCAAA AGCGTGGCGT GGCAAGACAA 1900 
GATTCAACCC TTCCGGAATC AAGACTTATA TAAGAGAATA CAATCACTCT 1950 
CCGCCGGCTA ACCTGTTGGA AGGGCGCGGG ACGCAGAGTG GTGCCCATGT 2000 
TGATATCATG GGAAACTTTG CTTTGATCGA AGATATCATG AGGGTTGCCG 2050 
CCGGAGGTAA CGGGTCTGAT ATAGGACATG ACCAGGTCCA CTCTGGCATA 2100 
TTTGAATGGT CGGAGCGTAT TGACCTGAAG CTGTGAATAT CATGATCTCT 2150 
TTAAGCTGTC CTGTCAGCTT ATGTGAATCC AATACTTTGA AAGAGAGATC 2200 
ATCATCAATT CATCATGATC GTCATCATCA TGATGCTCAA CTCACAAAGA 2250 
AGCCTGAGAA TGATACTTTG GTGCGAAATT CTCAATACCT CTTTAATATT 2300 
CTTATTGAAT GTAAATTATA CAATCCTATC TAATGTTTGA ACGATAACAC 2350 
AAAACTTGCT GCNGCCATGT TTGTTTGTCT TGTCAAAAGC ATCAATTTGT 2400 
GGGTTAAAAA AAAAAAAAAA AAAAAAA 2427 
<210> 6 
<211> 671 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 6 

MPLIHRKKPT EKPSTPPSEE VVHDEDSQKK PHESSKSHHK KSNGGGKWSC 50 
IDSCCWFIGC VCVTWWFLLF LYNAMPASFP QYVTERITGP LPDPPGVKLK 100 
KEGLKAKHPV VPIPGIVTGG LELWEGKQCA DGLFRKRLWG GTFGEVYKRP 150 
LCWVEHMSLD NETGLDPAGI RVHAVSGLVA ADYFAPGYFV WAVLIANLAH 200 
IGYEEKNMYM AAYDWRLSFQ NTEVRDQTLS RMKSNIELMV STNGGKKAVI 250 
VPHSMGVLYF LHFMKWVEAP APLGGGGGPD WCAKYIKAVM NIGGPFLGVP 300 
KAVAGLFSAE AKDVAVARAI APGFLDTDIF RLQTLQHVMR MTRTWDSTMS 350 
MLPKGGDTIW GGLDWSPEKG KTCCGKKQKN NETCGEAGEN GVSKKSPVNY 400 
GRMISFGKEV AEAAPSEINN IDFRGAVKGQ SIPNHTCRDV WTEYHDMGIA 450 
GIKAIAEYKV YTAGEAIDLL HYVAPKMMAR GAAHFSYGIA DDLDDTKYQD 500 
PKYWSNPLET KLPNAPEMEI YSLYGVGIPT ERAYVYKLNQ SPDSCIPFQI 550 
FTSAHEEDED SCLKAGVYNV DGDETVPVLS AGYMCAKAWR GKTRFNPSGI 600 
KTYIREYNHS PPANLLEGRG TQSGAHVDIM GNFALIEDIM RVAAGGNGSD 650 
IGHDQVHSGI FEWSERIDLK L 671 
<210> 7 
<211> 643 
<212> cDNA 
<213> Zea mays 

<221> CDS 

<222> (1)..(402) 

<400> 7 

CGG GAG AAA ATA GCT GCT TTG AAG GGG GGT GTT TAG TTA GCC GAT GGT 48 
Arg Glu Lys Ile Ala Ala Leu Lys Gly Gly Val Tyr Leu Ala Asp Gly 
1 5 10 15 

GAT GAA ACT GTT CCA GTT CTT AGT GCG GGC TAC ATG TGT GCG AAA GGA 96 
Asp Glu Thr Val Pro Val Leu Ser Ala Gly Tyr Met Cys Ala Lys Gly 
20 25 30 

TGG CGT GGC AAA ACT CGT TTC AGC CCT GCC GGC AGC AAG ACT TAC GTG 144 
Trp Arg Gly Lys Thr Arg Phe Ser Pro Ala Gly Ser Lys Thr Tyr Val 
35 40 45 

AGA GAA TAC AGC CAT TCG CCA CCC TCT ACT CTC CTG GAA GGC AGG GGC 192 
Arg Glu Tyr Ser His Ser Pro Pro Ser Thr Leu Leu Glu Gly Arg Gly 
50 55 60 

ACC CAG AGC GGT GCA CAT GTT GAT ATA ATG GGG AAC TTT GCT CTA ATT 240 
Thr Gln Ser Gly Ala His Val Asp Ile Met Gly Asn Phe Ala Leu Ile 
65 70 75 80 

GAG GAC GTC ATC AGA ATA GCT GCT GGG GCA ACC GGT GAG GAA ATT GGT 288 
Glu Asp Val Ile Arg Ile Ala Ala Gly Ala Thr Gly Glu Glu Ile Gly 
85 90 95 

GGC GAT CAG GTT TAT TCA GAT ATA TTC AAG TGG TCA GAG AAA ATC AAA 336 
Gly Asp Gln Val Tyr Ser Asp Ile Phe Lys Trp Ser Glu Lys Ile Lys 
100 105 110 

TTG AAA TTG TAA CCTATGGGAA GTTAAAGAAG TGCCGACGCG TTTATTGCGTTCC 391 
Leu Lys Leu 
115 

AAAGTGTCCT GCCTGAGTGC AACTCTGGAT TTTGCTTAAA TATTGTAATT TTTCACGC 449 
TTCATTCGTC CCTTTGTCAA ATTTACATTT GACAGGACGC CAATGCGATA CGATGTTG 507 
TACCGCTATT TTCAGCATTG TATATTAAAC TGTACAGGTG TAAGTTGCAT TTGCCAGC 565 
TGAAATTGTG TAGTGGTTTT CTTTACGATT TAATANCAAG TGGCGGAGCA GTGCCCCA 623 
AGCNAAAAAA AAAAAAAAAA 643 
<210> 8 
<211> 115 
<212> PRT 
<213> Zea mays 

<400> 8 

Arg Glu Lys Ile Ala Ala Leu Lys Gly Gly Val Tyr Leu Ala Asp Gly 
1 5 10 15 

Asp Glu Thr Val Pro Val Leu Ser Ala Gly Tyr Met Cys Ala Lys Gly 
20 25 30 

Trp Arg Gly Lys Thr Arg Phe Ser Pro Ala Gly Ser Lys Thr Tyr Val 
35 40 45 

Arg Glu Tyr Ser His Ser Pro Pro Ser Thr Leu Leu Glu Gly Arg Gly 
50 55 60 

Thr Gln Ser Gly Ala His Val Asp Ile Met Gly Asn Phe Ala Leu Ile 
65 70 75 80 

Glu Asp Val Ile Arg Ile Ala Ala Gly Ala Thr Gly Glu Glu Ile Gly 
85 90 95 

Gly Asp Gln Val Tyr Ser Asp Ile Phe Lys Trp Ser Glu Lys Ile Lys 
100 105 110 

Leu Lys Leu 
115 
<210> 9 
<211> 616 
<212> cDNA 

<213> Neurospora crassa 
<400> 9 

ggtggcgaag acganggcgg aagttggagg [missing] [missing] 50 
agatggatct accctctaga gacacgacta ccnttgcacc cagcctcaag 100 
gtntacngtt tntatgggta ggaagccgac ggagcgagcg tacatctatc 150 
tggcgcccga tcccgggacg acaacgcatc tttagatgac gatcgatacg 200 
actttgactn aggggcacat tgaccacggt gtgattttgg gcgaaggcga 250 
tggcacagtg aaccttatga gtttggggta cctgtgcaat aaggggtgga 300 
aaatgaagag atacaatcct gcgggctcaa aaataaccgt ggtcgagatg 350 
ccgcatgaac cagaacggtt caatccgaga ggagggccga atacggcgga 400 
tcacgtggat attctaggaa ggcagaatct aaacgagtac attcttaaag 450 
tggcggcagg tcgaggcgat acaattgagg attttattac tagtaatatt 500 
cttaaatatg tagaaaaggt tgaaatttat gaagagtaat taaatacggc 550 
acataggtta ctcaatagta tgactaatta aaaaaaaatt ttttttctaa 600 
aaaaaaaaaa aaaaaa 616 
<210> 10 

<211> 1562 

<212> genomic DNA 

<213> Arabidopsis thaliana 

<400> 10 

ATGAAAAAAA TATCTTCACA TTATTCGGTA GTCATAGCGA TACTCGTTGT 50 
GGTGACGATG ACCTCGATGT GTCAAGCTGT GGGTAGCAAC GTGTACCCTT 100 
TGATTCTGGT TCCAGGAAAC GGAGGTAACC AGCTAGAGGT ACGGCTGGAC 150 
AGAGAATACA AGCCAAGTAG TGTCTGGTGT AGCAGCTGGT TATATCCGAT 200 
TCATAAGAAG AGTGGTGGAT GGTTTAGGCT ATGGTTCGAT GCAGCAGTGT 250 
TATTGTCTCC CTTCACCAGG TGCTTCAGCG ATCGAATGAT GTTGTACTAT 300 
GACCCTGATT TGGATGATTA CCAAAATGCT CCTGGTGTCC AAACCCGGGT 350 
TCCTCATTTC GGTTCGACCA AATCACTTCT ATACCTCGAC CCTCGTCTCC 400 
GGTTAGTACT TTCCAAGATA TATCATTTTG GGACATTTGC ATAATGAACA 450 
AAATAGACAT AAATTTGGGG GATTATTGTT ATATCAATAT CCATTTATAT 500 
GCTAGTCGGT AATGTGAGTG TTATGTTAGT ATAGTTAATG TGAGTGTTAT 550 
GTGATTTTCC ATTTTAAATG AAGCTAGAAA GTTGTCGTTT AATAATGTTG 600 
CTATGTCATG AGAATTATAA GGACACTATG TAAATGTAGC TTAATAATAA 650 
GGTTTGATTT GCAGAGATGC CACATCTTAC ATGGAACATT TGGTGAAAGC 700 
TCTAGAGAAA AAATGCGGGT ATGTTAACGA CCAAACCATC CTAGGAGCTC 750 
CATATGATTT CAGGTACGGC CTGGCTGCTT CGGGCCACCC GTCCCGTGTA 800 
GCCTCACAGT TCCTACAAGA CCTCAAACAA TTGGTGGAAA AAACTAGCAG 850 
CGAGAACGAA GGAAAGCCAG TGATACTCCT CTCCCATAGC CTAGGAGGAC 900 
TTTTCGTCCT CCATTTCCTC AACGGTACCA CCCCTTCATG GCGCCGCAAG 950 
TACATCAAAC ACTTTGTTGC ACTCGCTGCG CCATGGGGTG GGACGATCTC 1000 
TCAGATGAAG ACATTTGCTT CTGGCAACAC ACTCGGTGTC CCTTTAGTTA 1050 
ACCCTTTGCT GGTCAGACGG CATCAGAGGA CCTCCGAGAG TAACCAATGG 1100 
CTACTTCCAT CTACCAAAGT GTTTCACGAC AGAACTAAAC CGCTTGTCGT 1150 
AACTCCCCAG GTTAACTACA CAGCTTACGA GATGGATCGG TTTTTTGCAG 1200 
ACATTGGATT CTCACAAGGA GTTGTGCCTT ACAAGACAAG AGTGTTGCCT 1250 
TTAACAGAGG AGCTGATGAC TCCGGGAGTG CCAGTCACTT GCATATATGG 1300 
GAGAGGAGTT GATACACCGG AGGTTTTGAT GTATGGAAAA GGAGGATTCG 1350 
ATAAGCAACC AGAGATTAAG TATGGAGATG GAGATGGGAC GGTTAATTTG 1400 
GCGAGCTTAG CAGCTTTGAA AGTCGATAGC TTGAACACCG TAGAGATTGA 1450 
TGGAGTTTCG CATACATCTA TACTTAAAGA CGAGATCGCA CTTAAAGAGA 1500 
TTATGAAGCA GATTTCAATT ATTAATTATG AATTAGCCAA TGTTAATGCC 1550 
GTCAATGAAT GA 1562 
<210> 11

<211> 3896 

<212> genomic DNA 

<213> Arabidopsis thaliana 

<400> 11 

ATGGGAGGGA ATTCGAAATC AGTAACGGCT 
TTTTTTCTTG ATTTGCGGTG GCCGAACTGC 
TTCACGGCGA CTACTCGAAG CTATCGGGTA 
TCGACGCAGC TACGAGCGTG GTCGATCCTT 
GGACTTCAAT CCGCTCGACC TCGTATGGCT 
ATCTTCATTT CCTTCGCTCC TTATTCTGTC 
AATTCCAAGC GAAATATAGC AATGAAGCAT 
TCGTTCATTA GTCAACAGTG ACGCTTCTGA 
AAAACAGCTG ACTCGGCGAG TGTTTCCCAT 
TAGCGGAATG AATGTGTAAT TAGTCTGCGC 
CAAGTTTTTC AGAGTGCTCA ATAGTAGTTA 
CTTGTGCATT GTGATTCTTT TGGTTGTTGC 
TGGTTTACAG CTTCTTTCTG CTGTCAACTG 
TAGATCCTTA TAATCAAACA GACCATCCCG 
AGTGGTCTTT CAGCCATCAC AGAATTGGAT 
TTTCGGATTT TTCTTTCTTT TGAGTTTTCT 
TGTGATATAA TATGGCTAAG TTCATTAATT 
TTTCTACTGT CTGGAAAGAG TGGCTTAAGT 
GAAGCAAATG CAATTGTCGC TGTTCCATAC 
CAAATTGGAA GAGCGTGACC TTTACTTTCA 
TCAGGCTAAT GTCTTTTATC TTCTCTTTTT 
TCTGGTCGTC TTCCTTTTTG CAGGTTGACC 
CCGTGGCGGC CCTTCTATAG TATTTGCCGA 
TCAGATACTT TCTGGAATGG CTGAGGCTAG 
TTGAAGTGGC TTGATCAGCA TATCCATGCT 
CCTACTATCC TTAAGTTACC Aa?TTTATTTT 
TGTTGTGACT TACTGGATTG AGCTCGATAC 
GAGCTCCTCT TCTTGGTTCT GTTGAGGCAA 
GTAACGTTTG GCCTTCCTGT TTCTGAGGTG 
TTTAAGTAGT TGATATCAAC CAGGTCTTAT 
TGAAAGTATT ACTTTTGTTA ATTGAACTGC 
TAGATCTTGA AGTGCTAGTT ATCAAAGAAC 
GTCAGCGGCC TTAGCTAATA CAACCAAACC 
TTCAGATTAT TATGGTAGAC TTTAAGTTGA 
CTTTTTATTT TAATAGGCTA TGATTTGTTT 
TGACATGCGC TTCTCATGTT TTTTGTTGGC 
GGTTGTTGTC CAATTCTTTT GCGTCGTCAT 
AAGAATTGCA AGGGTGATAA CACATTCTGG 
TGCAAAGAAA GATAAGCGCG TATACCACTG 
CAAAATATTC TGGCTGGCCG ACAAATATTA 
ACTAGCGGTT AGACTCTGTA TATGCAACTG 
CCAAGAATGT TCACTCTCAT ATTTCGTTCC 
TACAGAAACA GCTCTAGTCA ACATGACCAG 
CCCTTTTGTC TTTCACAGCC CGTGAACTAG 
GCAATAGAAG ACTATGACCC AGATAGCAAG 
GAAGTACGTA CCTTTCTTTG TGATAAGAAA 
TTGCTGGCTT CTTGTACGTC AAATTGTTTT 
TGTTCATATG CTTTGTCTTT CTTACTATAA 
CCTTATTATT GATTATCAGT TCTCTCCTTA 
GTTTACAGTT ATGAATGCAA AAGGGGGTAT 
ATTCTCTAGT TTGTTTTGAC TAATAGCGTC 
TCTTTGTGAA TTATATATAA CATGCTAACT 
TGATGACCCT GTTTTTAATC CTCTGACTCC 
AAAATGTATT TTGCATATAT GGTGCTCATC 
ATTCTCAATA TCACATTATG CGTTGACTTT 
GTTTGCAATA TCTTTTTGAA TTATGATTTA 
GCTATTAAGC GTTAAAGGTA CTAAATGTAT 





TPATPGCCGT 


50 




GAG AC CGAGT 


100 




ppp A T TT PP P 


150 




APAPTPPGTT 


200 




A A PPTPPPTP 






PTTPTTPATP 

^ X X vj X X WjTi X VJ 


300


X V_ X ^VjX V 


TPTTATTPAT 

X X X Jri. X X X 


350


±\ X W X VjiAVj X X i 


APAPTPATAT 


400


\^^^_ X X X X VJO X 


TPPPTAAATC 


450


X X 1 1 


APT APATPTP 


500




PPTP A TTTT A 

X \_>i A X X X X A 


550


TTAPTHATPP 


A PPTP ATPGA 


600


CVCZCZ^T^ A ACl 


TPTATPPTPP 

XvjXriX vjrtjx 


650


A niTY^T A A PfP 


APPPPPTPAP 


700


PP AP.PITT APA 


TA APAPPT AP 




TP A ATTTPAT 


ATP ATPTTPT 
riX^-r^XV^X Xv^X 


800


TPPT'P A A TTT 

X vcrO X *^jrlri X X JL 


TPAPPTPPTP 


850


Vjvj X X ^ X 1 Njr/i. 


^ X X X v^^vjr X iri X 




PATTPPAPAT 

Vji-i.X X ^Vji^Vcrra 1 


TPTP APP A AP 


950


PA APPTPAAP 


TTAPTPPTTA 
1 XrWjX^— v^X Xri. 


1000


A nnf^T' A A n A T A 


AflPTA ArZAPr' 


1050


'v^Tcz AAA r"vrz 


prpmrpA A A APT 
X X X Zo-Tlrirt^ X 


1100


TT^P A ATPPPT" 
X 1 ^jrt-ci 1 X 


A A T A A TPTPT 


1150




a A A A P A TT A T 


1200


m TV m rp m /"I /*t m 


TTPPT APPPP 


1250


X X I— . X X£\±\X X 


fiPPPPAPTTA 


1300


PTP A T^TT'P'T'T' 
V_ X <J£± X 1 X vjr X X 


PTTP A TTT A P 

<j X X Vj<ri XXX 


1350


TPAAATPTAP 


TPTPTPTPPT 

X \— X X ^ X VJVJ X 


1400 


i^V^v- X V,, ± X 


TPTPTTT APT 
X X X X X r^Vj X 




A APTP APTPP 


7\ rpmmfp PP mrprn 
iA. X X X X XXX 


1500


TPTAPPPPAT' 


ATPPT ATPTP 
>i X X X ^ X\j 


1550


£\Xr\X XVjXvjo-vj 


TAPTATAPPT 


1600


APATPTAPAP 


TPATTTAPTT 

X \ XXX f^vi? X X 


1650 


P A Ar* A A APT'T 


TP APTPA A AT 


1700


ATTPAAATPA 


TPTPAPATAT 


1750 


A APPPTTP AP 


PP A APTPPTP 


1800


TPTPPP T TAT 

XvjXVJV3^X Xi^i 


PPP ATTTTCA 


1850 


APPP ATTTTT 




1900 


TPATPAAPAP 

X O^ra X \jiT«rt.vjjtavj 


PAATATPAAT 


1950


TTAAPATTPA 


AATTPPTTPP 


2000


TA APAPTAAP 


AAA APTTTP A 


2050 


X X XvJrtXvjXVJX 


ATPPATPAGT 


2100 


PATPPAATPT 


PPPPTTPPCA 


2150 


P APATPPPAP 


TPTTTTP AAA 

X ^ X XX X V^JirtA 


2200


APP A TPTT AP 


AP P APT T AAA 


2250


T A TTPPTP AT 
X rl X X i v—i-i X 


PP ATP ATP AP 


2300


GTTTAAATCT 


CTATATCAAT 


2350 


GAAACAAGTA 


TAATCAGAAA 


2400 


TATTATGGAA 


TGTCTTTTTC 


2450 


TTTAGTTGAT 


TGATTCTCTC 


2500 


AATTTTGTTT 


TTCTAGCAAA 


2550 


ATACTTTTCA 


GGTTGTATCA 


2600 


TTGGGAGAGA 


CCACCTATAA 


2650 


TAAAGACAGA 


GGTATGATGC 


2700 


GTTATTATAT 


TCCCCATTTG 


2750 


TCTTCTCGCT 


TGCATCTTAT 


2800 


GAAGCTGTCT 


GTCATAGGTT 


2850 
GGTTATTACT TTGCCCCAAG TGGCAAACCT
CACGGATATC ATTTATGAAA CTGAAGGTTC 
CCGCAATGGC AGAAGTAAAA CAGGAAGGCA 
GTGGCATGTT ATCTCAGTTG CATAAGCAAA 
AAGTACTTTT TTATCATTCC TTTTGAGCTT 
AAGTGGGAAG AGGTGTTGCA TGAAACATGA 
AGCAAAACAA AACTAACCCA TTTCTGAATT 
GTGCTTTTAA AAAATTTGTT TTAAGAAACC 
GATTGTGCAA TATCTGCAGG TCTGGAACTG 
CCTATAACTG GGGATGAGAC GGTAAGCTCA 
CTTCTTGCAA ACTACTGAAG ACTAAGATAA 
CTTGCTATGT TCTCTAGTAC ACTGCAATAT 
TGATTATGAA ATTGATCTCT TATAGGTACC 
GCAAGAATTG GCTCGGACCT AAAGTTAACA 
CTCTTTTTTA GTTCCTCACC TTATATAGAT 
CTGGTTATGT GTTGATTTAC CTCCAATTTG 
TCTCTGTACT CCTCAAGAAC TTGTATTAAT 
GAAAATAAAA CAACAGCCAG AACACGATGG 
TAAATGTTGA TCATGAGCAT GGGTCAGACA 
GCACCAAGGG TTAAGTACAT AACCTTTTAT 
GGGGAAGAGA ACCGCAGTCT GGGAGCTTGA 



TATCCTGATA ATTGGATCAT 2900 

CCTCGTGTCA AGGTAATTTT 2950 

AAGTCTTCTG TATCAGTCTA 3000 

TTATTAAACA ACTAAAATTT 3050 

AGTGGATGAT CAGTGGCTTA 3100 

CACTTGTATC AAAGATAACT 3150 

TCATATTATT AGGAGTAGTC 3200 

GAAAAACTAG TTCATATCTT 3250 
TGGTTGATGG GAACGCTGGA 3300 

GAAGTTGGTT TTGAAATTAT 3350 

TACTTGCTTC TGGAACACTG 3400 

TGACTCTCCG CTACTTTTAT 3450 

CTATCATTCA CTCTCTTGGT 3500 

TAACAATGGC TCCCCAGGTA 3550 

CAAACTTTAA GTGTACTTTT 3600

TTCTTTCTAA AAATCATATA 3650

CTAAACGAGA TTCTCATTGG 3700 

AAGCGACGTA CATGTGGAAC 3750 

TCATAGCTAA CATGACAAAA 3800 

GAAGACTCTG AGAGCATTCC 3850 
TAAAAGTGGG TATTAA 3896
<210> 12
<211> 709
<212> cDNA

<213> Lycopersicon esculentum
<400> 12 

CTGGGGCCAA AAGTGAACAT AACAAGGACA 
TCAGATGTAC AAGTGCATCT AAATATAGAG 
CATTCCCAAT ATGACAAAGT TACCTACAAT 
AGGATTCTGA AAGTTTTCCA GGGACAAGAA 
AAAGCAAATC ACAGGAACAT TGTCAGATCT 
GTGGCTTGAG ATGTGGCATG ATATTCATCC 
TTACAAAAGG TGGTGTCTGA TCCTCACTAT 
GTTTGTATTG ACATTGTAAG TATTGCAACA 
TGAGGGATGA GGACTGCTAT TGGGATTACG 
GCTGAACATT GTGAATACAG GTTAGAATAT 
TATTCTCTTT TTGTGTATTT AGGCCACCTT 
GATATGTATT CGGGGATGTT CACCTGGGAC 
TCTACATCTC ACATCCTGTC ACACTATGTG 
TTGGCGGAAC AACAAGTTTG CACAAACATT 

TTCAGAGAG 



CCACAGTCAG 


AGCATGATGT 


50 


CATCAACATG 


GTGAAGATAT 


100 


GAAGTACATA 


ACCTATTATG 


150 


CAGCAGTTTG 


GGAGCTTGAT 


200 


CCAGCTTTGA 


TGCGGGAGCT 


250 


TGATAAAAAG 


TCCAAGTTTG 


300 


TTTCTTCTAT 


AAATGTTTGA 


350 


AAAAGCAAAG 


CGTGGGCCTC 


400 


GGAAAGCTCG 


ATGTGCATGG 


450 


TCAAATTATA 


TTTTGCAAAA 


500 


TCCCCGGTCA 


CAACGATGCA 


550 


AGAGTTGCAG 


ATTGAAGAGT 


600 


TGATATTTAA 


GAAACTTTGT 


650 


TGAAGAAGAA 


AGCGAAATGA 


700 



709 
<210> 13
<211> 623
<212> PRT

<213> Schizosaccharomyces pombe
<400> 13 

lyTASSKKSKTHKKXKETs/KSPIDLPNSKKPTRALSEQPSASETQSVSl^S^ 70 
FFFAVGDDNAWDPATLDKFGNMLGSSDLFDDIKGYLSYNWKDAPFTTDKPSQSPSGNEVQVGLDM^ 140 

GYRSDHPVIMVPGVISSGLESWSFNNCSIPYFKKRLWGSWSMLKAMFLDKQCWL 210

KljRAAQGFEAAIDFFITGYWIWSKVIEIsniiAAIGYEPNISnyi 280 

IWKKKVVLISKSMGSQVTYYFFKWA^AEGYGNGGPTWVISfDHIEA^ 350
TGIVITLNILEKFFSRSERMMVRTMGGVSSMLPKGGDVAPDDLNQTNFSNGAIIRYREDIDKDHDEFDI 420 

DDALQFLKWTDDDFK?yMLAKNYSHGLAWTEKEVLKNNEM 490 

ERGYYYTNNPEGQPVIDSSVNDGTKVENGIVMDDGDGTLPILALGLVCNKWQT?^ 560

HEPAAFDLRGGPRSAEHVDILGHSELNEIILKVSSGHGDSVPNRYISDIQEIINEIlSrLDKPRN 623 
<210> 14
<211> 432 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 14 

MKKISSHYSWIAILWVTMTSMCQAVGSIWYPLILVPGNGGNQLEVRLDREYKPSSWCSSWLYPIHK^ 70
SGGWPRLWFDAAVLLSPFTRCFSDRMMLYYDPDLDDYQNAPGVQTRVPHFGSTKSLLYLDPRLRDATSYM 140
EHLVKALEKKCGYVNDQTILGAPYDFRYGLAASGHPSRVASQFLQDLKQLVEKTSSENEGKPVILLSHS 210 
GGLFVLHFLmTTPSWRKKYIKHFVALAAPWGGTISQMKTFASGNTLGVPLWPLLVRRHQRTSESNQWL 280 
LPSTKVFHDRTKPLVWPQVHYTAYEmRFFACIGFSQGWPYKTRVLPLTEELMTP^^ 350 
TPEVLMYGKGGFDKQPEIKYGDGDGTWLASLAALKVDSLNTVEIDGVSHTSIIiro^ 420

NYELANVNAVNE 432 
<210> 15
<211> 552
<212> PRT

<213> Arabidopsis thaliana
<400> 15 

MGANSKSVTASFTVIAVFFLICGGRTAVSDETEFHGDYSKLSGIIIPGFASTQLRAWSILDCPYTPLDFN 70
PLDLVWLDTTKLLSAWCWFKCMVLDPYNQTDHPECKSRPDSGLSAITELDPGYITGPLSTVWKEWL^ 140 
VEFGIEANAIVAVPYDWRLSPTKLEERDL YFHKIiKLTFETALKI.RGGPS I VFAHSMGNWFRYFLEWLRL 210 
EIAPKHYLKWLDQHIHAYFAVGAPLLGSVEAIKSTLSGVTFGLPVSEGTARLLSNSFASSLWLMPFSKNC 280 
KGDNTFWTHFSGGAAKKDKRVYHCDEEEYQSKYSGWPTNIINIEIPSTSARELADGTLFKAIEDYDPDSK 350
RlX[LHQLKKYVPFFVIRNIAHRSSLAGFLLYHDDPVFNPLTPWERPPIKl>r^/F 420 
KPYPDNWI I TD 1 1 YSTEGSLVSRSGTWDGNAGP ITGDETVP YHS L S WCKNWLGPKVKITMAPQ IL I GKI 490 
KQQPEHDGSDVHVELNVDHEHGSDIIANMTKAPRVKYITFYEDSESIPGKRTAVWELDKSGY 552



<170> PatentIn Ver. 2.0



<210> 1
<211> 661
<212> PRT

<213> Saccharomyces cerevisiae

<400> 1

Met Gly Thr Leu Phe Arg Arg Asn Val Gln Asn Gln Lys Ser Asp Ser
1 5 10 15

Asp Glu Asn Asn Lys Gly Gly Ser Val His Asn Lys Arg Glu Ser Arg
20 25 30

Asn His Ile His His Gln Gln Gly Leu Gly His Lys Arg Arg Arg Gly
35 40 45

Ile Ser Gly Ser Ala Lys Arg Asn Glu Arg Gly Lys Asp Phe Asp Arg
50 55 60

Lys Arg Asp Gly Asn Gly Arg Lys Arg Trp Arg Asp Ser Arg Arg Leu
65 70 75 80

Ile Phe Ile Leu Gly Ala Phe Leu Gly Val Leu Leu Pro Phe Ser Phe
85 90 95

Gly Ala Tyr His Val His Asn Ser Asp Ser Asp Leu Phe Asp Asn Phe
100 105 110

Val Asn Phe Asp Ser Leu Lys Val Tyr Leu Asp Asp Trp Lys Asp Val
115 120 125

Leu Pro Gln Gly Ile Ser Ser Phe Ile Asp Asp Ile Gln Ala Gly Asn
130 135 140

Tyr Ser Thr Ser Ser Leu Asp Asp Leu Ser Glu Asn Phe Ala Val Gly
145 150 155 160

Lys Gln Leu Leu Arg Asp Tyr Asn Ile Glu Ala Lys His Pro Val Val
165 170 175

Met Val Pro Gly Val Ile Ser Thr Gly Ile Glu Ser Trp Gly Val Ile
180 185 190

Gly Asp Asp Glu Cys Asp Ser Ser Ala His Phe Arg Lys Arg Leu Trp
195 200 205

Gly Ser Phe Tyr Met Leu Arg Thr Met Val Met Asp Lys Val Cys Trp
210 215 220

Leu Lys His Val Met Leu Asp Pro Glu Thr Gly Leu Asp Pro Pro Asn
225 230 235 240

Phe Thr Leu Arg Ala Ala Gln Gly Phe Glu Ser Thr Asp Tyr Phe Ile
245 250 255

Ala Gly Tyr Trp Ile Trp Asn Lys Val Phe Gln Asn Leu Gly Val Ile
260 265 270

Gly Tyr Glu Pro Asn Lys Met Thr Ser Ala Ala Tyr Asp Trp Arg Leu
275 280 285

Ala Tyr Leu Asp Leu Glu Arg Arg Asp Arg Tyr Phe Thr Lys Leu Lys
290 295 300

Glu Gln Ile Glu Leu Phe His Gln Leu Ser Gly Glu Lys Val Cys Leu
305 310 315 320

Ile Gly His Ser Met Gly Ser Gln Ile Ile Phe Tyr Phe Met Lys Trp
325 330 335

Val Glu Ala Glu Gly Pro Leu Tyr Gly Asn Gly Gly Arg Gly Trp Val
340 345 350

Asn Glu His Ile Asp Ser Phe Ile Asn Ala Ala Gly Thr Leu Leu Gly
355 360 365

Ala Pro Lys Ala Val Pro Ala Leu Ile Ser Gly Glu Met Lys Asp Thr
370 375 380

Ile Gln Leu Asn Thr Leu Ala Met Tyr Gly Leu Glu Lys Phe Phe Ser
385 390 395 400

Arg Ile Glu Arg Val Lys Met Leu Gln Thr Trp Gly Gly Ile Pro Ser
405 410 415

Met Leu Pro Lys Gly Glu Glu Val Ile Trp Gly Asp Met Lys Ser Ser
420 425 430

Ser Glu Asp Ala Leu Asn Asn Asn Thr Asp Thr Tyr Gly Asn Phe Ile
435 440 445

Arg Phe Glu Arg Asn Thr Ser Asp Ala Phe Asn Lys Asn Leu Thr Met
450 455 460

Lys Asp Ala Ile Asn Met Thr Leu Ser Ile Ser Pro Glu Trp Leu Gln
465 470 475 480

Arg Arg Val His Glu Gln Tyr Ser Phe Gly Tyr Ser Lys Asn Glu Glu
485 490 495

Glu Leu Arg Lys Asn Glu Leu His His Lys His Trp Ser Asn Pro Met
500 505 510

Glu Val Pro Leu Pro Glu Ala Pro His Met Lys Ile Tyr Cys Ile Tyr
515 520 525

Gly Val Asn Asn Pro Thr Glu Arg Ala Tyr Val Tyr Lys Glu Glu Asp
530 535 540

Asp Ser Ser Ala Leu Asn Leu Thr Ile Asp Tyr Glu Ser Lys Gln Pro
545 550 555 560

Val Phe Leu Thr Glu Gly Asp Gly Thr Val Pro Leu Val Ala His Ser
565 570 575

Met Cys His Lys Trp Ala Gln Gly Ala Ser Pro Tyr Asn Pro Ala Gly
580 585 590

Ile Asn Val Thr Ile Val Glu Met Lys His Gln Pro Asp Arg Phe Asp
595 600 605

Ile Arg Gly Gly Ala Lys Ser Ala Glu His Val Asp Ile Leu Gly Ser
610 615 620

Ala Glu Leu Asn Asp Tyr Ile Leu Lys Ile Ala Ser Gly Asn Gly Asp
625 630 635 640

Leu Val Glu Pro Arg Gln Leu Ser Asn Leu Ser Gln Trp Val Ser Gln
645 650 655

Met Pro Phe Pro Met
660

<210> 2
<211> 387
<212> PRT

<213> Arabidopsis thaliana
<400> 2

Val Gly Ser Asn Val Tyr Pro Leu Ile Leu Val Pro Gly Asn Gly Gly
1 5 10 15

Asn Gln Leu Glu Val Arg Leu Asp Arg Glu Tyr Lys Pro Ser Ser Val
20 25 30

Trp Cys Ser Ser Trp Leu Tyr Pro Ile His Lys Lys Ser Gly Gly Trp
35 40 45

Phe Arg Leu Trp Phe Asp Ala Ala Val Leu Leu Ser Pro Phe Thr Arg
50 55 60

Cys Phe Ser Asp Arg Met Met Leu Tyr Tyr Asp Pro Asp Leu Asp Asp
65 70 75 80

Tyr Gln Asn Ala Pro Gly Val Gln Thr Arg Val Pro His Phe Gly Ser
85 90 95

Thr Lys Ser Leu Leu Tyr Leu Asp Pro Arg Leu Arg Asp Ala Thr Ser
100 105 110

Tyr Met Glu His Leu Val Lys Ala Leu Glu Lys Lys Cys Gly Tyr Val
115 120 125

Asn Asp Gln Thr Ile Leu Gly Ala Pro Tyr Asp Phe Arg Tyr Gly Leu
130 135 140

Ala Ala Ser Gly His Pro Ser Arg Val Ala Ser Gln Phe Leu Gln Asp
145 150 155 160

Leu Lys Gln Leu Val Glu Lys Thr Ser Ser Glu Asn Glu Gly Lys Pro
165 170 175

Val Ile Leu Leu Ser His Ser Leu Gly Gly Leu Phe Val Leu His Phe
180 185 190

Leu Asn Arg Thr Thr Pro Ser Trp Arg Arg Lys Tyr Ile Lys His Phe
195 200 205

Val Ala Leu Ala Ala Pro Trp Gly Gly Thr Ile Ser Gln Met Lys Thr
210 215 220

Phe Ala Ser Gly Asn Thr Leu Gly Val Pro Leu Val Asn Pro Leu Leu
225 230 235 240

Val Arg Arg His Gln Arg Thr Ser Glu Ser Asn Gln Trp Leu Leu Pro
245 250 255

Ser Thr Lys Val Phe His Asp Arg Thr Lys Pro Leu Val Val Thr Pro
260 265 270

Gln Val Asn Tyr Thr Ala Tyr Glu Met Asp Arg Phe Phe Ala Asp Ile
275 280 285

Gly Phe Ser Gln Gly Val Val Pro Tyr Lys Thr Arg Val Leu Pro Leu
290 295 300

Thr Glu Glu Leu Met Thr Pro Gly Val Pro Val Thr Cys Ile Tyr Gly
305 310 315 320

Arg Gly Val Asp Thr Pro Glu Val Leu Met Tyr Gly Lys Gly Gly Phe
325 330 335

Asp Lys Gln Pro Glu Ile Lys Tyr Gly Asp Gly Asp Gly Thr Val Asn
340 345 350

Leu Ala Ser Leu Ala Ala Leu Lys Val Asp Ser Leu Asn Thr Val Glu
355 360 365

Ile Asp Gly Val Ser His Thr Ser Ile Leu Lys Asp Glu Ile Ala Leu
370 375 380

Lys Glu Ile
385



<210> 3
<211> 389
<212> PRT

<213> Arabidopsis thaliana
<400> 3

Leu Lys Lys Glu Gly Leu Lys Ala Lys His Pro Val Val Phe Ile Pro
1 5 10 15

Gly Ile Val Thr Gly Gly Leu Glu Leu Trp Glu Gly Lys Gln Cys Ala
20 25 30

Asp Gly Leu Phe Arg Lys Arg Leu Trp Gly Gly Thr Phe Leu Cys Trp
35 40 45

Val Glu His Met Ser Leu Asp Asn Glu Thr Gly Leu Asp Pro Ala Gly
50 55 60

Ile Arg Val Arg Ala Val Ser Gly Leu Val Ala Ala Asp Tyr Phe Ala
65 70 75 80

Pro Gly Tyr Phe Val Trp Ala Val Leu Ile Ala Asn Leu Ala His Ile
85 90 95

Gly Tyr Glu Glu Lys Asn Met Tyr Met Ala Ala Tyr Asp Trp Arg Leu
100 105 110

Ser Phe Gln Asn Thr Glu Arg Asp Gln Thr Leu Ser Arg Met Lys Ser
115 120 125

Asn Ile Glu Leu Met Val Ser Thr Asn Gly Gly Lys Lys Ala Val Ile
130 135 140

Val Pro His Ser Meu Gly Val Leu Tyr Phe Leu His Phe Met Lys Trp
145 150 155 160

Val Glu Ala Pro Ala Pro Leu Gly Gly Gly Gly Gly Pro Asp Trp Cys
165 170 175

Ala Lys Tyr Ile Lys Ala Val Met Asn Ile Gly Gly Pro Phe Leu Gly
180 185 190

Val Pro Lys Ala Val Ala Gly Leu Phe Ser Ala Glu Ala Lys Asp Met
195 200 205

Arg Met Thr Arg Thr Trp Asp Ser Thr Met Ser Met Leu Pro Lys Gly
210 215 220

Gly Asp Thr Ile Trp Gly Gly Leu Asp Trp Ser Pro Glu Leu Pro Asn
225 230 235 240

Ala Pro Glu Met Glu Ile Tyr Ser Leu Tyr Gly Val Gly Ile Pro Thr
245 250 255

Glu Arg Ala Tyr Val Tyr Lys Leu Asn Gln Ser Pro Asp Ser Cys Ile
260 265 270

Pro Phe Gln Ile Phe Thr Ser Ala His Glu Glu Asp Glu Asp Ser Cys
275 280 285

Leu Lys Ala Gly Val Tyr Asn Val Asp Gly Asp Glu Thr Val Pro Val
290 295 300

Leu Ser Ala Gly Tyr Met Cys Ala Lys Ala Trp Arg Gly Lys Thr Arg
305 310 315 320

Phe Asn Pro Ser Gly Ile Lys Thr Tyr Ile Arg Glu Tyr Asn His Ser
325 330 335

Pro Pro Ala Asn Leu Leu Glu Gly Arg Gly Thr Gln Ser Gly Ala His
340 345 350

Val Asp Ile Met Gly Asn Phe Ala Leu Ile Glu Asp Ile Met Arg Val
355 360 365

Ala Ala Gly Gly Asn Gly Ser Asp Ile Gly His Asp Gln Val His Ser
370 375 380

Gly Ile Phe Glu Trp
385



<210> 4
<211> 1986
<212> DNA

<213> Saccharomyces cerevisiae

<220>

<221> CDS

<222> (1)..(1983)

<400> 4 , ^^rr a.at aat tct 48 



20 llTJc aca ctg ttt cga aga aa. gtc cag aac caa aag agt gat tct 
Met Gly Thr Leu Plie Arg- Arg Asn Val Gin Asn c^±n uy 
1 5 10 

as S "n Jin SI IS S i S Sn S S SS IE s 

ill s m s: ss s-^ 5 s s ss 

30 35 40 

K ffi El S SS S K i K ^p^^ S S 

50 5^ 

s s III i| s Ss s 5s s s s s ^| 
- s lii s i| ss s s i| 
K s jjs If. s IS K s; s pj JiS ^S'. 

100 

f.; L'.^ IS L*^ r» S2 Si 12 s Sp i "i s ?^ 
I- S5 IS SI sS re? IS jr, s s fi is e is 

130 135 
tac tec aca tct tct tta gat gat etc agt gaa aat ttt gcc gtt ggt 
Tyr Ser Thr Ser Ser Leu Asp Asp Leu Ser Glu Asn pne 
145 150 155 

60 aaa caa etc tta cgt gat tat aat ate gag gee aaa eat eet gtt gta 
Lys Gin Leu Leu Arg Asp Tyr Asn lie Glu Ala Lys iiis r 



45 



50 115 



96 



144 



192 



288 



336 



3 84 



432' 



523 



165 170 175 

atg gtt cct ggt gtc att tct acg gga att gaa agc tgg gga gtt att 576
Met Val Pro Gly Val Ile Ser Thr Gly Ile Glu Ser Trp Gly Val Ile
180 185 190

gga gac gat gag tgc gat agt tct gcg cat ttt cgt aaa cgg ctg tgg 624
Gly Asp Asp Glu Cys Asp Ser Ser Ala His Phe Arg Lys Arg Leu Trp
195 200 205

gga agt ttt tac atg ctg aga aca atg gtt atg gat aaa gtt tgt tgg 672
Gly Ser Phe Tyr Met Leu Arg Thr Met Val Met Asp Lys Val Cys Trp
210 215 220

ttg aaa cat gta atg tta gat cct gaa aca ggt ctg gac cca ccg aac 720
Leu Lys His Val Met Leu Asp Pro Glu Thr Gly Leu Asp Pro Pro Asn
225 230 235 240

ttt acg cta cgt gca gca cag ggc ttc gaa tca act gat tat ttc atc 768
Phe Thr Leu Arg Ala Ala Gln Gly Phe Glu Ser Thr Asp Tyr Phe Ile
245 250 255

gca ggg tat tgg att tgg aac aaa gtt ttc caa aat ctg gga gta att 816
Ala Gly Tyr Trp Ile Trp Asn Lys Val Phe Gln Asn Leu Gly Val Ile
260 265 270

ggc tat gaa ccc aat aaa atg acg agt gct gcg tat gat tgg agg ctt 864
Gly Tyr Glu Pro Asn Lys Met Thr Ser Ala Ala Tyr Asp Trp Arg Leu
275 280 285

gca tat tta gat cta gaa aga cgc gat agg tac ttt acg aag cta aag 912
Ala Tyr Leu Asp Leu Glu Arg Arg Asp Arg Tyr Phe Thr Lys Leu Lys
290 295 300

gaa caa atc gaa ctg ttt cat caa ttg agt ggt gaa aaa gtt tgt tta 960
Glu Gln Ile Glu Leu Phe His Gln Leu Ser Gly Glu Lys Val Cys Leu
305 310 315 320

att gga cat tct atg ggt tct cag att atc ttt tac ttt atg aaa tgg 1008
Ile Gly His Ser Met Gly Ser Gln Ile Ile Phe Tyr Phe Met Lys Trp
325 330 335

gtc gag gct gaa ggc cct ctt tac ggt aat ggt ggt cgt ggc tgg gtt 1056
Val Glu Ala Glu Gly Pro Leu Tyr Gly Asn Gly Gly Arg Gly Trp Val
340 345 350

aac gaa cac ata gat tca ttc att aat gca gca ggg acg ctt ctg ggc 1104
Asn Glu His Ile Asp Ser Phe Ile Asn Ala Ala Gly Thr Leu Leu Gly
355 360 365

gct cca aag gca gtt cca gct cta att agt ggt gaa atg aaa gat acc 1152
Ala Pro Lys Ala Val Pro Ala Leu Ile Ser Gly Glu Met Lys Asp Thr
370 375 380

att caa tta aat acg tta gcc atg tat ggt ttg gaa aag ttc ttc tca 1200
Ile Gln Leu Asn Thr Leu Ala Met Tyr Gly Leu Glu Lys Phe Phe Ser
385 390 395 400

aga att gag aga gta aaa atg tta caa acg tgg ggt ggt ata cca tca 1248
Arg Ile Glu Arg Val Lys Met Leu Gln Thr Trp Gly Gly Ile Pro Ser
405 410 415

atg cta cca aag gga gaa gag gtc att tgg ggg gat atg aag tca tct 1296
Met Leu Pro Lys Gly Glu Glu Val Ile Trp Gly Asp Met Lys Ser Ser
420 425 430

tca gag gat gca ttg aat aac aac act gac aca tac ggc aat ttc att 1344
Ser Glu Asp Ala Leu Asn Asn Asn Thr Asp Thr Tyr Gly Asn Phe Ile
435 440 445

cga ttt gaa agg aat acg agc gat gct ttc aac aaa aat ttg aca atg 1392
Arg Phe Glu Arg Asn Thr Ser Asp Ala Phe Asn Lys Asn Leu Thr Met
450 455 460

aaa gac gcc att aac atg aca tta tcg ata tca cct gaa tgg ctc caa 1440
Lys Asp Ala Ile Asn Met Thr Leu Ser Ile Ser Pro Glu Trp Leu Gln
465 470 475 480

aga aga gta cat gag cag tac tcg ttc ggc tat tcc aag aat gaa gaa 1488
Arg Arg Val His Glu Gln Tyr Ser Phe Gly Tyr Ser Lys Asn Glu Glu
485 490 495

gag tta aga aaa aat gag cta cac cac aag cac tgg tcg aat cca atg 1536
Glu Leu Arg Lys Asn Glu Leu His His Lys His Trp Ser Asn Pro Met
500 505 510

gaa gta cca ctt cca gaa gct ccc cac atg aaa atc tat tgt ata tac 1584
Glu Val Pro Leu Pro Glu Ala Pro His Met Lys Ile Tyr Cys Ile Tyr
515 520 525

ggg gtg aac aac cca act gaa agg gca tat gta tat aag gaa gag gat 1632
Gly Val Asn Asn Pro Thr Glu Arg Ala Tyr Val Tyr Lys Glu Glu Asp
530 535 540

gac tcc tct gct ctg aat ttg acc atc gac tac gaa agc aag caa cct 1680
Asp Ser Ser Ala Leu Asn Leu Thr Ile Asp Tyr Glu Ser Lys Gln Pro
545 550 555 560

gta ttc ctc acc gag ggg gac gga acc gtt ccg ctc gtg gcg cat tca 1728
Val Phe Leu Thr Glu Gly Asp Gly Thr Val Pro Leu Val Ala His Ser
565 570 575

atg tgt cac aaa tgg gcc cag ggt gct tca ccg tac aac cct gcc gga 1776
Met Cys His Lys Trp Ala Gln Gly Ala Ser Pro Tyr Asn Pro Ala Gly
580 585 590

att aac gtt act att gtg gaa atg aaa cac cag cca gat cga ttt gat 1824
Ile Asn Val Thr Ile Val Glu Met Lys His Gln Pro Asp Arg Phe Asp
595 600 605

ata cgt ggt gga gca aaa agc gcc gaa cac gta gac atc ctc ggc agc 1872
Ile Arg Gly Gly Ala Lys Ser Ala Glu His Val Asp Ile Leu Gly Ser
610 615 620

gcg gag ttg aac gat tac atc ttg aaa att gca agc ggt aat ggc gat 1920
Ala Glu Leu Asn Asp Tyr Ile Leu Lys Ile Ala Ser Gly Asn Gly Asp
625 630 635 640

ctc gtc gag cca cgc caa ttg tct aat ttg agc cag tgg gtt tct cag 1968
Leu Val Glu Pro Arg Gln Leu Ser Asn Leu Ser Gln Trp Val Ser Gln
645 650 655

atg ccc ttc cca atg taa 1986
Met Pro Phe Pro Met
660

<210> 5
<211> 661
<212> PRT

<213> Saccharomyces cerevisiae

<400> 5

Met Gly Thr Leu Phe Arg Arg Asn Val Gln Asn Gln Lys Ser Asp Ser
1 5 10 15

Asp Glu Asn Asn Lys Gly Gly Ser Val His Asn Lys Arg Glu Ser Arg
20 25 30

Asn His Ile His His Gln Gln Gly Leu Gly His Lys Arg Arg Arg Gly
35 40 45

Ile Ser Gly Ser Ala Lys Arg Asn Glu Arg Gly Lys Asp Phe Asp Arg
50 55 60

Lys Arg Asp Gly Asn Gly Arg Lys Arg Trp Arg Asp Ser Arg Arg Leu
65 70 75 80

Ile Phe Ile Leu Gly Ala Phe Leu Gly Val Leu Leu Pro Phe Ser Phe
85 90 95

Gly Ala Tyr His Val His Asn Ser Asp Ser Asp Leu Phe Asp Asn Phe
100 105 110

Val Asn Phe Asp Ser Leu Lys Val Tyr Leu Asp Asp Trp Lys Asp Val
115 120 125

Leu Pro Gln Gly Ile Ser Ser Phe Ile Asp Asp Ile Gln Ala Gly Asn
130 135 140

Tyr Ser Thr Ser Ser Leu Asp Asp Leu Ser Glu Asn Phe Ala Val Gly
145 150 155 160

Lys Gln Leu Leu Arg Asp Tyr Asn Ile Glu Ala Lys His Pro Val Val
165 170 175

Met Val Pro Gly Val Ile Ser Thr Gly Ile Glu Ser Trp Gly Val Ile
180 185 190

Gly Asp Asp Glu Cys Asp Ser Ser Ala His Phe Arg Lys Arg Leu Trp
195 200 205

Gly Ser Phe Tyr Met Leu Arg Thr Met Val Met Asp Lys Val Cys Trp
210 215 220

Leu Lys His Val Met Leu Asp Pro Glu Thr Gly Leu Asp Pro Pro Asn
225 230 235 240

Phe Thr Leu Arg Ala Ala Gln Gly Phe Glu Ser Thr Asp Tyr Phe Ile
245 250 255

Ala Gly Tyr Trp Ile Trp Asn Lys Val Phe Gln Asn Leu Gly Val Ile
260 265 270

Gly Tyr Glu Pro Asn Lys Met Thr Ser Ala Ala Tyr Asp Trp Arg Leu
275 280 285

Ala Tyr Leu Asp Leu Glu Arg Arg Asp Arg Tyr Phe Thr Lys Leu Lys
290 295 300

Glu Gln Ile Glu Leu Phe His Gln Leu Ser Gly Glu Lys Val Cys Leu
305 310 315 320

Ile Gly His Ser Met Gly Ser Gln Ile Ile Phe Tyr Phe Met Lys Trp
325 330 335

Val Glu Ala Glu Gly Pro Leu Tyr Gly Asn Gly Gly Arg Gly Trp Val
340 345 350

Asn Glu His Ile Asp Ser Phe Ile Asn Ala Ala Gly Thr Leu Leu Gly
355 360 365

Ala Pro Lys Ala Val Pro Ala Leu Ile Ser Gly Glu Met Lys Asp Thr
370 375 380

Ile Gln Leu Asn Thr Leu Ala Met Tyr Gly Leu Glu Lys Phe Phe Ser
385 390 395 400

Arg Ile Glu Arg Val Lys Met Leu Gln Thr Trp Gly Gly Ile Pro Ser
405 410 415

Met Leu Pro Lys Gly Glu Glu Val Ile Trp Gly Asp Met Lys Ser Ser
420 425 430

Ser Glu Asp Ala Leu Asn Asn Asn Thr Asp Thr Tyr Gly Asn Phe Ile
435 440 445

Arg Phe Glu Arg Asn Thr Ser Asp Ala Phe Asn Lys Asn Leu Thr Met
450 455 460

Lys Asp Ala Ile Asn Met Thr Leu Ser Ile Ser Pro Glu Trp Leu Gln
465 470 475 480

Arg Arg Val His Glu Gln Tyr Ser Phe Gly Tyr Ser Lys Asn Glu Glu
485 490 495

Glu Leu Arg Lys Asn Glu Leu His His Lys His Trp Ser Asn Pro Met
500 505 510

Glu Val Pro Leu Pro Glu Ala Pro His Met Lys Ile Tyr Cys Ile Tyr
515 520 525

Gly Val Asn Asn Pro Thr Glu Arg Ala Tyr Val Tyr Lys Glu Glu Asp
530 535 540

Asp Ser Ser Ala Leu Asn Leu Thr Ile Asp Tyr Glu Ser Lys Gln Pro
545 550 555 560

Val Phe Leu Thr Glu Gly Asp Gly Thr Val Pro Leu Val Ala His Ser
565 570 575

Met Cys His Lys Trp Ala Gln Gly Ala Ser Pro Tyr Asn Pro Ala Gly
580 585 590

Ile Asn Val Thr Ile Val Glu Met Lys His Gln Pro Asp Arg Phe Asp
595 600 605

Ile Arg Gly Gly Ala Lys Ser Ala Glu His Val Asp Ile Leu Gly Ser
610 615 620

Ala Glu Leu Asn Asp Tyr Ile Leu Lys Ile Ala Ser Gly Asn Gly Asp
625 630 635 640

Leu Val Glu Pro Arg Gln Leu Ser Asn Leu Ser Gln Trp Val Ser Gln
645 650 655

Met Pro Phe Pro Met
660
SEQUENCE LISTING

<110> Scymnc Dr. , Seen 
<120> 



<130> 

<140>
<210> 1

<211> 1986

<212> genomic DNA

<213> Saccharomyces cerevisiae

<220>

<221> CDS

<222> (1)..(1983)

<400> 1

*^^S SS'= acs. ccg- ttt cga a^a aat: gtc c?-S aac caa sag agt gat tec 4 8 
Met Gly Thr Leu Phe Arg Arg Asn VslI Gin Asn Gin Lys Ser Asp Scr 
15 10 15 

gat gaa a.ac aat aaa ggg ggt tct gtt cat aac aag cga gag age aga 
Asp Glu Atrn Asn Lye Gly Gly Ser Val His Asn Lys Arc Glu £er Arg 
20 2S 30 

aac cac att cat cat caa cag gga tts. ggc cat aag aga aga agg ggt 144 
Asn. His lie His His Gin Gin Gly hen Gly His Lys Arg Arg Arg , Gly 
35 40 45 

att age ggc agt gca aaa aga aat gag cgt ggc aaa oac tCc gac agg X32 
He Ser Gly Ser Ala Lys Arg Asn Glu Arg Gly Lys Asp Phe Asp Arg 
SO 55 60 

aaa sga gac ggg aac ggt aga aaa cgt tgg aga gat tec aga aga etc 240 
Lys Arg Asp Gly Asn Gly Arg Lys Arg Trp Arg Asp Sc>r Axg Arg Leu 
S5 70 75 SO 

att ttc att ctt ggt gca ttc tta ggt gta ctt ttg ccg Ctt age ztt 288 
lie Phe lie Leu Gly A?,a Phe Leu Gly Val Leu Lsu Pro ?he Ser Phe 
35 SO 55 

ggc get tat cat gtt cat aat age gat age gac ttg ttt gac aac ttt 33 S 
Gly Ala Tyr His Val His Asn Sar Asp Ser Asp Leu Phe Asp Asn Phe 
100 lOS 110 






gta aat ttt gat tea etc aaa gtg tat ttg gac gat tgg aaa gat gtt 384 
Val Asn Phe Asp Scr Leu Lys Val T/r Leu Asp Asp Trp Lya Asp Val 
115 X20 125 

etc ccc^ caa. ggt aca agt teg ctt acc gat gat- ate cag get ggt aac 432 
Leu Pro Gin. Gly lie Ser Ser Ph« lie Asp Asp lie Gin Ala Gly .Asn 
130 135 140 

tac tec aca tct ccc tta gat gat ccc agt gaa aat ttt gee gtt ggt -iSO 
Tyr Ser Thr Ser Ser Leu Asp Asp Leu Ser Glu Asn Pha A.la Val Gly 
145 ISO 1S5 150 

aaci caa etc tta cgt gat tat aat ace gag gee aaa cat cjtju yi-u yut» sso 
Ly:; Gin Leu Leu Arg A.=p Tyr Asn He Glu Ala Lys His Pro Val Val 
Ids 170 175 

atg get cct ggt gtc stt tct acg gga. att gaa age tgg gga gtt att 57a 
Met Val I>ro Gly Val He Ser Thr Gly He Glu Ser Trp Gly Val Xle 
ISO nas 150 

gg-a g-ac gat gag cgc rjstt ^nr, geg eat ttt cgt aaa cqcj ctg tgg ^24 

Gly Asp Asp Glu Cys Asp Ser Ser Ala His Ph£i Arg Lys Arg Leu Trp 
125 200 20s 

gga agt ttt cac atg etc aga aca atg gtt atg gat aaa gtt tgt tgg S72 
Gly Ser Pb.e Tyr Met Leu A^g Thr Met Val Met Asp Lys Val Cys Trp 
210 215 220 

ttg a.aa cat gta atg tta gat cct gaa aca ggt ctg gac cca ccg aac 720 
Leu Lys His Val Met Leu Asp Pro Glu Thr Gly Leu Asp Pro Pro Asn 
225 230 235 240 

ttt acg eta cgt gca gca cag ggc ttc gaa tea act gat tat ttc ate 7oa 
Phe Thr Leu Arg Ala Ala Gin Gly Phe Gin Ser Thr A,sp Tyr Pha Xle 
2<L5 250 255 

gca egg tat tgg att tgg aac aaa gtt ttc caa aat ctg gga gta att SIS 
Ala Gly Tyr Trp Xle Trp Asn Lys Val Phe Gin Asn Leu Gly Val He 
2€0 265 270 



ggc tat gaa. ccc aat aaa atg acg agt get gcg tat gat tgg agg ctt 
Gly Tyr Glu Pro Asn Lys Met Thr Ser Ala Ala Tyr Asp Trp Arg Leu 
275 280^ 235 



8^4 
cjca cac tea gat cca caa aga cgc gat sag tac ctt acg aag eta aa.g S15
Ala. Ty^^ -^sp Leu Glu Arg Arg Asp Arg Tyr Phe Thr Lys Leu Lys 

290 2BS 300 

gas caa atrc gaa ctg ttt cat caa ttg agt ggt gaa aaa gtt tgt tta SSQ 
Glu Gin lie Glu Leu Phe His Gin Lau Scr Gly Glu Lys Val Cys Leu . 
305 310 31S 320 

act gga cat tct atg ggt cct cag att ate ttt tac ttt atg aaa tgg IQoa 
lie Gly His Ser Met Gly Sar Gin lie lie Phe Tyr Phe Met Lys Trp 
32S 330 33S 

gtc gag get gaa ggc cct ctt tac ggt aat ggt ggt cgt ggc tgg gtt 105S 
Val Glu Ala Glu Gly Pro Leu Tyr Gly Ai^n Gly Gly Arg Gly Trp Val 
3'10 345 3SQ 

aac gaa cac ata gac tea etc att aat gca gca ggg acg ctt ctg ogc 1X04 
Asn Glu Hie lie Asp Ser Phe 11^ Asn Ala Ala Gly Thr Leu Leu Gly 
355 360 3SS 

get cca aag gca get cca get eta utt agt ggt gaa atg &aa gat acc 1152 
Ala Pro Lyc A,la Val Pro Ala Leu lie Ser Gly Glu Met Lys Asp Thr 
370 375 3S0 

att caa tta aat acg tea gcc atg tat ggt ttg gaa aag tec ttc eca 1200 
lie Gin Xieu Asn Thr Leu Ala Met; Tyr Gly Leu Glu Lys Phe Phe Sax 
385 330 325 ^00 

aga att gag aga gea aaa atg tta caa acg tgg ggt age ata cca tea 124a 
Arg lie Glu Arg Val Lys Mat Leu Gin Thr Trp Gly Gly lie Pro £cr 
405 ^10 415 

atg eta cca aag gga gaa gag gtc att tgg ggg gat atg aag tea e.ct 1236 
Meat Leu Pro Lys Gly Glu Glu Val He Trp Gly Asp Met Lys 5er Ser 
420 425 430 

tea gag gat gca ttg aat aac aac act gac aca tac ggc aat ttc act 13 44 
Ser Glu Asp Ala Leu Asn Asn Asn Thr Asp Thr Tyr Gly Asn Phe He 
/^3S 440 

cga ect gaa agg aat acg age cat get ttc aac aaa aat ttg aca atg X3 52 
Arg Phs Glu Arg Asn Thr Ser Asp Ala Phe Asn Lys Asn Leu Thr Met 
450 ^S5 



aaa gac gcc att aac atg aca tta teg ata tea cct gaa tag etc caa 
Lys Asp Ala He Asn Met Thr Leu Ser lie Ser Pro Glu Trp Leu Gin 
4S5 ' 470 475 480 



1440 
acja aga gta cac gag cag tac teg ttc ggc CaC Ccc aag aac gaa gaa 14S0
Arg Arg Va.1 His Glu Gin Tyr Ser Phe Gly Tyr Scr Lys Asn Giu Glu 
435 490 4SS 

gag ccsi aga s.aa aac gag eta esc cac aag cac tgg teg aat cca atg 153^ 
Glu Leu Arg Lys Asn Glu Lsu His His Ijys His Trp Ser Ann Pro Met 
SOO 505 SIO 

gaa gta cca ctt cca gaa get: ccc cac atg aaa acc tut tgt ata tac ISS-i 
Glu Val Pro Leu Pro Glu Ala ?ro His Met Lys He Tyr Cys lie Tyr 
SIS 520 S25 

959 S^S aac cca act gaa. agg gca tat gta tat aag gaa gag gat 1532 

Gly Val Asr\ Asn Pro Thr Glu Arg Ala Tyr Val Tyr Lys Glu Glu Asp 
530 S35 540 

gac tec tct get ctg aat ttg acc acc gac cac gaa age aag caa cct 1680 
Asp Ser Ser Ala Leu Asn Leu Thr Ile Asp Tyr Glu Ser Lys Gln Pro
545 550 555 560

gta tec etc acc gag ggg gac gga acc gtt ccg etc gtg gcg cat tea 1728 
Val Phe Leu Thr Glu Gly Asp Gly Thr Val Pro Leu Val Ala His Ser
565 570 575

atg tgt: cac aaa egg gcc cag ggt get tea ccg tac aac cct gcc gga 1776
Met Cys His Lys Trp Ala Gln Gly Ala Ser Pro Tyr Asn Pro Ala Gly
580 585 590

att aac gtt act att gtg gaa atg aaa cac cag cca gat cga ctt gat 1824 
Ile Asn Val Thr Ile Val Glu Met Lys His Gln Pro Asp Arg Phe Asp
595 600 605

ata cgt ggc gga gca aaa age gcc gaa cac gte. gac ate etc ggc age 1872 
Ile Arg Gly Gly Ala Lys Ser Ala Glu His Val Asp Ile Leu Gly Ser
610 615 620

gcg gag ttg aac gat tac ate ttg aaa att gca age ggt aat ggc gat 1920
Ala Glu Leu Asn Asp Tyr Ile Leu Lys Ile Ala Ser Gly Asn Gly Asp
625 630 635 640



etc gtc gag cca cgc caa ttg cct aat ttg age cag tgg gtt tct cag 1968
Leu Val Glu Pro Arg Gln Leu Ser Asn Leu Ser Gln Trp Val Ser Gln
645 650 655



atg ccc ttc cca atg
Met Pro Phe Pro Met
660 
<211> 661
<212> PRT 

<213> Saccharomyces cerevisiae
<400> 2 

Met Gly Thr Leu Phe Arg Arg Asn Val Gln Asn Gln Lys Ser Asp Ser

1 5 10 15

Asp Glu Asn Asn Lys Gly Gly Ser Val His Asn Lys Arg Glu Ser Arg

20 25 30 

Asn His Ile His His Gln Gln Gly Leu Gly His Lys Arg Arg Arg Gly

35 40 45

Ile Ser Gly Ser Ala Lys Arg Asn Glu Arg Gly Lys Asp Phe Asp Arg

50 55 60

Lys Arg Asp Gly Asn Gly Arg Lys Arg Trp Arg Asp Ser Arg Arg Leu
65 70 75 80

Ile Phe Ile Leu Gly Ala Phe Leu Gly Val Leu Leu Pro Phe Ser Phe

85 90 95

Gly Ala Tyr His Val His Asn Ser Asp Ser Asp Leu Phe Asp Asn Phe

100 105 110 

Val Asn Phe Asp Ser Leu Lys Val Tyr Leu Asp Asp Trp Lys Asp Val

115 120 125

Leu Pro Gln Gly Ile Ser Ser Phe Ile Asp Asp Ile Gln Ala Gly Asn

130 135 140 

Tyr Ser Thr Ser Ser Leu Asp Asp Leu Ser Glu Asn Phe Ala Val Gly
145 150 155 160

Lys Gln Leu Leu Arg Asp Tyr Asn Ile Glu Ala Lys His Pro Val Val

165 170 175

Met Val Pro Gly Val Ile Ser Thr Gly Ile Glu Ser Trp Gly Val Ile

180 185 190

Gly Asp Asp Glu Cys Asp Ser Ser Ala His Phe Arg Lys Arg Leu Trp 

195 200 205 

Gly Ser Phe Tyr Met Leu Arg Thr Met Val Met Asp Lys Val Cys Trp 

210 215 220 

Leu Lys His Val Met Leu Asp Pro Glu Thr Gly Leu Asp Pro Pro Asn
225 230 235 240

Phe Thr Leu Arg Ala Ala Gln Gly Phe Glu Ser Thr Asp Tyr Phe Ile

245 250 255 

Ala Gly Tyr Trp Ile Trp Asn Lys Val Phe Gln Asn Leu Gly Val Ile

260 265 270

Gly Tyr Glu Pro Asn Lys Met Thr Ser Ala Ala Tyr Asp Trp Arg Leu

275 280 285

Ala Tyr Leu Asp Leu Glu Arg Arg Asp Arg Tyr Phe Thr Lys Leu Lys

290 295 300

Glu Gln Ile Glu Leu Phe His Gln Leu Ser Gly Glu Lys Val Cys Leu
305 310 315 320
Ile Gly His Ser Met Gly Ser Gln Ile Ile Phe Tyr Phe Met Lys Trp

325 330 335

Val Glu Ala Glu Gly Pro Leu Tyr Gly Asn Gly Gly Arg Gly Trp Val

340 345 350

Asn Glu His Ile Asp Ser Phe Ile Asn Ala Ala Gly Thr Leu Leu Gly

355 360 365

Ala Pro Lys Ala Val Pro Ala Leu Ile Ser Gly Glu Met Lys Asp Thr

370 375 380

Ile Gln Leu Asn Thr Leu Ala Met Tyr Gly Leu Glu Lys Phe Phe Ser
385 390 395 400

Arg Ile Glu Arg Val Lys Met Leu Gln Thr Trp Gly Gly Ile Pro Ser

405 410 415

Met Leu Pro Lys Gly Glu Glu Val Ile Trp Gly Asp Met Lys Ser Ser

420 425 430

Ser Glu Asp Ala Leu Asn Asn Asn Thr Asp Thr Tyr Gly Asn Phe Ile

435 440 445

Arg Phe Glu Arg Asn Thr Ser Asp Ala Phe Asn Lys Asn Leu Thr Met

450 455 460

Lys Asp Ala Ile Asn Met Thr Leu Ser Ile Ser Pro Glu Trp Leu Gln
465 470 475 480

Arg Arg Val His Glu Gln Tyr Ser Phe Gly Tyr Ser Lys Asn Glu Glu

485 490 495

Glu Leu Arg Lys Asn Glu Leu His His Lys His Trp Ser Asn Pro Met

500 505 510

Glu Val Pro Leu Pro Glu Ala Pro His Met Lys Ile Tyr Cys Ile Tyr

515 520 525

Gly Val Asn Asn Pro Thr Glu Arg Ala Tyr Val Tyr Lys Glu Glu Asp

530 535 540

Asp Ser Ser Ala Leu Asn Leu Thr Ile Asp Tyr Glu Ser Lys Gln Pro
545 550 555 560

Val Phe Leu Thr Glu Gly Asp Gly Thr Val Pro Leu Val Ala His Ser

565 570 575

Met Cys His Lys Trp Ala Gln Gly Ala Ser Pro Tyr Asn Pro Ala Gly

580 585 590

Ile Asn Val Thr Ile Val Glu Met Lys His Gln Pro Asp Arg Phe Asp

595 600 605

Ile Arg Gly Gly Ala Lys Ser Ala Glu His Val Asp Ile Leu Gly Ser

610 615 620

Ala Glu Leu Asn Asp Tyr Ile Leu Lys Ile Ala Ser Gly Asn Gly Asp
625 630 635 640

Leu Val Glu Pro Arg Gln Leu Ser Asn Leu Ser Gln Trp Val Ser Gln

645 650 655

Met Pro Phe Pro Met

660
<210> 3

<211> 2312 

<212> genomic DNA

<213> Schizosaccharomyces pombe

<400> 3 



ATGGCGTCTT CCAAGAAOAG CAAAACTCAT A^'.GA-AAAAGA AAGAAGTCAA 50
ATCTCCTATC GACTTACCAA ATTCAAAGAA ACCAACXCGC GCTXTGAGXG 100
AGCAACCTTC AGCGTCCGAA ACACAATCTG XXXCAAAXAA ATCAAGAAAA 150
TCTAAATTTG GAAAAAGATT GAATTTTATA XTGGGCGCTA TXXXGGGAAX 200
ATGCGGTOCT TTTTTTTTCG CTGTTGGAGA CGACAAXGGX GXTXXCGACC 250
CTGCTACGTT AGATAAATTT GGGAATATGG XAGGCXCXXC AGACXXGXXX 300
GATCACATTA AACGATATTT ATCTTATAAT GXGXXTAAGG ATGCACCXXX 350
TACTACGGAC AAGCCTTCGC AGTCTGCTAG CGGAAAXGAA GTXCAAGXXG 400
GTCTTGATAT GTACAATGAG GGATATCGAA GXGACCATCC TGTXAXTAXG 450
GTTCCTGGTG TTATCAGCTC AGGATTAGAA AGXTGGXCGX TTAAXAAXXG 500
CTCGATTCCT TACTTTAGGA AACGTCTTTG GGGXAGCXGG XCXAXGCTGA 550
AGGCAATGTT CCTTGACAJ^^G CAATGCTGGC XXGAACAXXX AAXGCXXGAX 600
AA>J^Ai?iACCG GCTTGGATCC GAAGGGAATT AAGCXGCGAG CAGCXCAGGG 650
GTTTOAAGCA GCTGATTTTT TTAXCACGGG CXAXXGGAXX TQGAGTAAAG 700
TAATTGAAAA CCTTGCTGCA ATTGGTTAXG AGCCXAAXAA CAXGTTAAGT 750
GCTTCTXACG ATTGGCGGTT AXCAXAXGCA AAXXXAGAGG AACGXGAXA-?. 800
ATATTTTTCA AAGTTAAAAA TGXXCATXGA GTACAGCAAC AXTGXACATA 850
AGAAAAAGGT AGTGTTGATT TCXCACXCGA TGGGTXCACA GijXXACGXA-C 900
iAXTi i rrxA AGCrrUAGGGC •T* i 1i T*!^ XAC\jGrv-»^i. « 950
TTGGGTTAAT GATCATATTG AAGCATXTAX AAATGXGAGX CXCGAXGGXX 1000
GTTTGACTAC GTTTCTAACT XTXGAAXAGA XAXCGGGAXC TXXGAXXGGA 1050
GCACCCAJ=^-AA CAGTGGCAGC OGXXTTAXCG GGXGAAAXGA AAGAXACAGG 1100
TAT-rGTAATT ACATTAAACA XGXTAATAXX XAATXXXXGC XAACCGTTXX 1150
;^^,aCTCAATT GAATCAGXTT XCGGTCTAXG GGXAAGCAAT AAAXXGXXGA 1200
GATTTCiT'1'^i.C TAATTTACXG XXXAGTTXaG AAAAATTXXX TXCCCGTTCX 1250
GACGTATATT CAAAAATACA MTGXGCXCX ACXTTTTGTA ACXTTXAAXA 1300
GAGAGCCATG ATGGTTCGCA CXATGGGAGG AGTTAGTXCT ATGCXXCCXA 1350
AAGGAGGCGA TOTTGTATGG GGAAATGCCA GXTGGGXAAG AAATAXGXqc 1400
tgtt^atttt: ttatt^atat ttaggctcca GATGATCTTA ATCAAACAAA 1450
tttttccaat ggtgcaatta ttcgatatag AGAAGACATT GATAAGGACC 1500
acgatgaatt tgacatagat gatgcattac AATTTTTAAA AAATGTTACA 1550
gatgacgatt ttaaagtcat gctagcgaaa AATTATTCCC ACGGTCTTGC 1600
ttggactgaa aaagaagtgt taaaaaataa CGAAATGCCG TCTAAATGGA 1650
taaal'ccgct agaagtaaga acattaa-^gt TACTAAATTA TACTAACCCA 1700
aatagactag tcttccttat gctcctcata TGAAAATTTA TTGCGTTCAC 1750
cgggtcggaa aaccaj^ctga gagaggttat TATTATACTA ATAATCCTGA 1800
GGGGCP-ACCT GTCATTGATT CCTCGGTTAA TGATGGAACA AAAGTTGAAA 1850
ATGTGAGAGA ATTTATGTTT CAAACATTCT ATTAACTGTT TTATTAGGGT 1900
ATTGTTATGG ATGATCGTGA TGOAACTTTA CCAATATTAG CCCTTGGTTT 1950
GGTGTGCAAT A?^J2.GTTTGGC AAACAAAAAG GTTTAATCCT GCTAATACAA 2000
GTATCACAAA TTATGAAATC AAGCATGAAC CTGCTGCGTT TGATCTGAGA 2050
QGI\GaP.CCrC GCTCGGCAGA ACACGTCGAT ATACTTGGAC ATTCAGAGCT 2100
Aa_ATGTATGT TCATTTTTACC TTACAAATTT CTATTACTAA CTCTTGAAAT 2150
AAGGAAATTA TTTTAAAAGT TTCATCAGGC CATGGTGACT CGGTACCAAA 2200
CCGTTATATA TCAGATATCC AGTACGGACA TAAGTTTXGT AGATTGCAAT 2250
TZNACTAACTA ACCGAACAGG GAAATAATAA ATGAGATAAA TCTCGATAAA 2300
CCTAGAAATT AA 2312
<211> 3685

<212> genomic DNA

<213> Arabidopsis thaliana





ATGCCCCTTA TTCATCGGAA AAAGCCGACG GAGAAACCAT CGACGCCGCC 50
ATCTGAAGAG GTGGTGCACG ATGAGGATTC GCAAAAGAJiA CCACACGAAX 100
CTTCCAAATC CCACCAXAAG AAATCGAACG GAGGAGGGA\ GTGGXCGTGC 150
ATCGATTCTT GTTGTTGGTT CATTGGGTGT GTGXGXGXAA CCXGGXGGXX 200
TCTTCTCTTC CTTTACAACG CAATGCCTGC GAGCTXCCCT CAGXAXGTAA 250
CGGAGCGAAT CACGGGTCCT TTGCCTGACC CGCCCGGTGT XAAGCXGAAA 300
A^\GAAGGTC TTAAGGCGAA ACATCCTGTT GXCXXCATXC CTGGGATTGX 350
CACCGGTCGG CTCGAGCXTT GGGAAGGCAA ACAAXGCGCT GATGGXXXAT 400
TTAGAAAACG TTTGTGGCGT GGAACTTTTG GXGAAGXCXA CAAAAGGXGA 450
GCTCAACAAT TCTCACTCTT CCTTTATATT GGGATXXGGA XTGGAXCTOA 500
TGAGATCACG CACTTGTTQC TTCTTCAACA XCACTCAAAC XXXAAXXCCA 550
TGTTTGTCTG TCTTACTCTT TACTTTTTTT TXTXTXXGAT GXGA?lACGCX 600
ATTTTCa"X.a^»i GAGACTATTT CTGTATGTGX AAGGTAAGCG XXCCAAGGAC 650
GXAAITGGCT TGGACTATTT CTGTTXGATT GXXAACX^XTA GGAXATAAAJa, 700
TAGCTGCCTT GGAATTTCAA GXCAXCXTAX XGCCAAAXCX GXXGCTAGAC 750
ATGCCCTAGA GTCGGTTCAT AACAAGTXAC XXCGXXXACX GXCGXTGCGX 800
GTAGATTTAG CTTTGTGTAG CGXAXAATGA AGXAGXGXTX XAXGXXXTGX 850
TGGGAATAGA GPuAGTTCTAA CTACATCTGX GGAAAGTGXG XTCAGGCXGX 900
GATAGAGOAC TGTTGCTTTA XXATTCAACX ATGXATAXGX GXAAXXAAAG 950
CTAGTTCCTT TTTGATCXTT CAGCXCAATG XGGTXXXCXC AAXXXXXXTC 1000
TCAATTTCAA AGTTTC2iCAT CGAGXXXAXT CACAXGXCXT GAAXTXCGXC 1050
CArCCTCGXT CTGTTATCCA GCTXXGAACX CCTCCCGACC CXGCXAXGGA 1100
TATATTAAAA AAA.BAGTGTT XXGXGGGXTG CATGXXTGXX ACGATCTGCA 1150
XCTICTTC-rr TCGGCTCAGT GTTCATGXTX XTGCTAXGGX AGAQAXGGGG 1200
jaATGTTATTG TTGATGGTAA CAQTGGXATA GXXGAXAGXA XCTTAACXAA 1250
TCAATTATCT CXTTGAITCA GGCCXCTAXG XTGGGTGGAA CACAXGXCAC 1300
TTGACAATGA AACTGGGTTG GATCCAGCXG GTAXXAGAGT XCGAGCXGXA 1350
CCAGGACTCG TGGCTGCTGA CTACTTTGCT CCTGGGTACT TTGTCTGGCC 1400

AGTGCTGATT GCTAACCTTG CA.C AT ATTGG ATATGAAGAG A^AAATATGT 14 5 0 

ACATGGCTGC ATATGACTGG CGGCTTTCGT TTCAGAACAC AGAGGTTCTT 15 0 0 

TTGTCATCGT TCTTXCTATT ATTCTGTTCC ATGTTACGTT TCTTTCTTCA 15 5 0 

TTACTTAAGG CTXAAATATG TTTCATGTTG AATTAATAGG TACGTGATGA 2^0 0 

GACTCTTAOC CGTATGAAAA GTAATATAGA GTTGATGGTT TCTACCAACG IS 5 0 

GTGGAAAAAA AGCAGTTATA GTTCCGCATT CCATGGGGGT CTTGTArTTT 17 0 0 

CTACATTTTA TGAAGTGGGT TGAGGCACCA GCTCCTCTGG GTGGCGGGGG 175 0 

TCGGCCAG AT TGGTGTGCAA AGTATATXAA GGCGGTGATG ^J^CATTGGTG IS 00 

GACCArrXGT TGGTGTTCCA AAS^GCTGTTG CAGGGCTTTT CTCTGCTGA>. 185 0 

GCAaAGGATG TTGCAGTTGC CAGGTATTGA ATATCTGCTT ATACTTTTGA 19 0 0 

TGAtCACAAC CTTCGGTCTG GAACTCAAAG TTATTGTACT AAATATCAAT 155 0 

TCTAATAACA TTGCTATATT ATCGCTGCAA CTGAGATTGG TtGATTATTT 2 0 00 

XTGCTGCTTA TGTAACTGAA ACTCTCTTGA GATTAGACAA ATGATGAATT 2 05 0 

GATAATTCTT ACGCATTGCT CTGTGATGAC CAGXTTGTTA GCXTCGACGA 210 0 

XAACAXXXGT CAXACXGXCX XXTGGAGGGC ATXGAAXTTX GCTAXGGAAA 215 0 

GCGCTGGAGC TTCCATGCXX GCATTCXXXA CCAAXTAGCG XTAXXCXGCX 22 00 

TCXTXC^iAXX TXCXXGXAXA TGCATCTAXG GTCTTTXATT XCTXCXXAAX 225 0 

XAAAGACTCG TXGGAXTAGX XGCXCTATTA GXCACXXGGT XCCTXAAXAT 23 0 0 

AGAACXXTAC TTTCTXCGAA AAXXGCAGAG CGAXXGCCCC AGGAXXCTTA 2 3 50 

GACACCGaXA TAXTTAGACT XCAGACCTXG CAGCAXGTAA TGAGAAXGAC 2 4 00 

ACGCACAXGG GACXCAACAA TGXCXAXGXT ACCGAAGGGA GGXGACACGA 2^5 0 

TAXGGGGCGG GCXXGAXXGG TCACCGGAGA AAGGCCACAC CXGXTGXGGG 25 0 0 

AA.AAAGCAA.A AGAACAACGA AACXTGTGGX GAAGCAGGXG AAAACGOAGT 2 55 0 

XXCC^AGAA^ AGXCCXGXTA ACTAXGOAAG GAXGAXATCX XXXGGGAAAG 2 500 

AAGTAGCAGA CGCTGCGCCA XCTGAGATXA AXAAXATXGA TXXXCGAGXA 2 550 

AGGACAXAXA AAXCATAATA AACCTTGXAC AXXXXGXGAX TGTAXGAXGA 27 00 

ATATCXGXAC ATXXXATCXG GXGAAGGGXG CXGXGAAAGG TCAGAGXAXC 2 750 

CCAAA.XCACA CCTGXCGXGA CGXGTGGACA GAGTACCAXG ACAXGGGAAT 2 30 0 

TGCTGGGAXC AAAGCXAXCG CXGAGXATAA GGrC':S^C?>^CT GCXGGXGAaC 2SS0 
CTA.TAGATCT ACTACATXAT GTTGCTCCTA AGATGATGGC GCGTGGTGCC 2900
GCTCATTTCT CTTATGGAAT TGCTGATGAT TTGGATGACA CCAAGTATCA 2950
AGATCCCA^-A TACTGGTCAA ATCCGXTAGA GACAAAGTAA CTGATTTCTT 3000
GATTCCAACT GTATCCTTCG TCCTGATGCA TTAXCAGTCT TTTTGTTTTC 3050
GGTCTTGTTG GATATGGTTT TCAGCTCAAA GCTTACAAAG CTGTTTCTGA 3100
GCCTTTCTGA AAAAGGCTTG CTCAGTAATA TTGAGGTGCT AAAGTTGATA 3150
CATGTQACTC TTGCTTATAA ATCCTCCGTT TGGTTTGTTC TGCTTTirrCA 3200
GATTACCGAA TGCTCCTGAG ATGG;iJHATCT' ACTCATTATA CGGAGTGGGG 3250
AThCa^J^COG AACGAGCATA CGTATACAAG CTTTAACCAGT CTCCCGACAG 3300
TTGCATCCCC TTTCAGATAT TCACTTCTGC TCACGAGGAG GACGAAGATA 3350
GCTGTCTGAA AGCAGGAGTT TACAATGTGG ATGGGGATGA AACAGTACCC 3400
GTCCTAAGTG CCGGGTACAT GTGTGCAAAA GCGTGGCGTG GCAAGAC^AG 3450
ATTCAACCCT TCCGGAATCA AGACTTATAT AAGAGAATAC AATCACTCTC 3500
CGCCGGCTAA CCTGTTGGAA GGGCGCGGGA CGCAGAGTGG TGCCCATGTT 3550
GATATCATGG GAAACTTTGC TTTGATCGAA GATATCATGA GGGTTGCCGC 3600
CGGAGGTAAC GGGTCTGATA TAGGACATGA CCAGGTCCAC TCTGGCATAT 3650
TTGAATGGTC GGAGCGTATT GACCTGAAGC TGTGA 3685
<210> 5
<211> 402
<212> cDNA

<213> Arabidopsis thaliana

<220>

<221> CDS

<222> (120)..(402)
<400> 5

AGAAACji.GCTCTTTGTCTCT CTCGACTGATCTAACAATCC CTAATCTGTGTTCTAAATTC 60

CTGGACGAGATTTGACi^AAG TCCGTATAGCTTAACCTGGT TTAATTTCAAGTGACAGAT 119 

ATC CCC CTT ATT CAT CGG AAA AAG CCG ACG CAG AAA CCA TCG ACG CCG 167
Met Pro Leu Ile His Arg Lys Lys Pro Thr Glu Lys Pro Ser Thr Pro

CCA TCT GAA GAG GTG GTG CAC GAT GAG GAT TCG CAA AAG AAA CCA CAC 215 
Pro Ser Glu Glu Val Val His Asp Glu Asp Ser Gln Lys Lys Pro His

GAA TCT TCC AAA TCC CAC CAT AAG NAA TCG AAC GGA GGA GGG K^G TQG 263 
Glu Ser Ser Lys Ser His His Lys ??? Ser Asn Gly Gly Gly Lys Trp

TCG TGC ATC GAT TCT TGT TGT TGG TTC ATT GGG TGT GTG TGT GTA ACC 311 
Ser Cys Ile Asp Ser Cys Cys Trp Phe Ile Gly Cys Val Cys Val Thr

TGG TGG TTT CTT CTC TTC CTT TAG AAC GCA ATG CCT GCG AGC TTC CCT 359
Trp Trp Phe Leu Leu Phe Leu Tyr Asn Ala Met Pro Ala Ser Phe Pro 

CAG TAT GTA ACG GAG CCG AAT CAC G^G TCC TTT CCC TTA CCC G 402

Gln Tyr Val Thr Glu Pro Asn His ??? Ser Phe Ala Leu Pro
<210> 6

<212> cDNA

<213> Zea mays

<220>

<221> CDS

<222> (1)..(402)

<400> 6

CGG GAG ;U^u.^ ATA GCT GCT TTG AAG GGG GGT GTT TAG TTA GCC GAT GGT 48
Arg Glu Lys Ile Ala Ala Leu Lys Gly Gly Val Tyr Leu Ala Asp Gly
1 5 10 15

GAT GAA ACT GTT CCA GTT CTX AGT GCG GGC TAC ATG TGT GCG AAA GGA 96
Asp Glu Thr Val Pro Val Leu Ser Ala Gly Tyr Met Cys Ala Lys Gly
20 25 30 

TGG CGT GGC AAA ACT CCT TTC AGC CCT GCC GGC AGC AAG ACT TAC GTG 144 
Trp Arg Gly Lys Thr Arg Phe Ser Pro Ala Gly Ser Lys Thr Tyr Val 
35 40 45 

AGA TAC AGC CAT TCG CCA CCC TCT ACT CTC CTG GAA GGC AGG GGC 192 

Arg Glu Tyr Ser His Ser Pro Pro Ser Thr Leu Leu Glu Gly Arg Gly 
50 55 60

ACC CAG AGC GGT GCA CAT GTT GAT ATA ATG GGG AAC TTT GCT CTA ATT 240
Thr Gln Ser Gly Ala His Val Asp Ile Met Gly Asn Phe Ala Leu Ile
65 70 75 80

GAG GAC GTC ATC AGA ATA CCT GCT GGG GCA ACC GGT GAG GAA ATT GGT 288
Glu Asp Val Ile Arg Ile Ala Ala Gly Ala Thr Gly Glu Glu Ile Gly
85 90 95 

GGC GAT CAG GIT TAT TCA GAT ATA TTC AAG TGG TCA GAG AAA ATC AAA 336 
Gly Asp Gln Val Tyr Ser Asp Ile Phe Lys Trp Ser Glu Lys Ile Lys
100 105 110

TTG AAA TTG TAA CCTATGGGAA GTTAAAGAAG TGCCGACCCG TTTATTGCGTTCC 391

Leu Lys Leu 
115 

>-^-AGTGTCCT GCCTGAGTGC AACTCTGGAT TTTGCTTAAA TATTCT^^ATT TTTCACGC 449

TTCATTCGTC CCTTTGTCAA ATTTACATTT GACAGGACGC CAATGCGATA CGATGTTG 507

TACCGCTATT TTCAGCATTG TATATTAAAC TGTACAGGTG TAAGTTGCAT TTGCCAGC 565

TGAA-ATTGTG TAGTCGTTTT CTTTACGATT TAATANCAAG TGGCGGAGCA GTGCCCCA 623



AGCNAAAAAA AAAAAAAAAA 643
<210> 7
<211> 115

<213> Zea mays
<400> 7 



Arg Glu Lys Ile Ala Ala Leu Lys Gly Gly Val Tyr Leu Ala Asp Gly
1 5 10 15

Asp Glu Thr Val Pro Val Leu Ser Ala Gly Tyr Met Cys Ala Lys Gly 
20 25 30 



Trp Arg Gly Lys Thr Arg Phe Ser Pro Ala Gly Ser Lys Thr Tyr Val
35 40 45



Arg Glu Tyr Ser His Ser Pro Pro Ser Thr Leu Leu Glu Gly Arg Gly
50 55 60



Thr Gln Ser Gly Ala His Val Asp Ile Met Gly Asn Phe Ala Leu Ile
65 70 75 80



Glu Asp Val Ile Arg Ile Ala Ala Gly Ala Thr Gly Glu Glu Ile Gly
85 90 95



Gly Asp Gln Val Tyr Ser Asp Ile Phe Lys Trp Ser Glu Lys Ile Lys
100 105 110



Leu Lys Leu 
115 
<210> 8
<212> cDNA 

<213> Neurospora crassa
<400> 8

GGTGGCGAAG ACG;:^IGGCGG AAGTTGGAGG 
XGJ^TGGATCT ACCCTCTAGA GACACGACTA 
GT^^TACNaTT TKTATGGGTA GGAAGCCGAG 
TGGCGCCCGA TCCCGGG^^CG ACAACGCATC 
ACTTTGACTN' AGGGGCACAT TGACCACCGT 
TGGCACAGTG A.?«,CCTTATGA GTTTGGGGTA 
XAATGAACAG ATACAATCCT GCGGGCTGAA 
CCGCAXGAAC CAGAACGGTT CAATCCGAGA 
CTTAA?i.TATG TAG-^-^-AAGGT TGAAATTTAT 
ACATAGGTTA CTCAATAGTA TGACTAATTA 
AAAAAAA-IJ^A AAA^^AA 



CTA,ACGAGAA TGACKCTCGG £0 

CCNTTGCACC CAGCCTCAAG 100 

GGAGCGAGCC TACATCTATC 150 

TTTAGXTGAC GATCGATACG 20 0 

GTGATTTTGG GCGAAGGCGA 25 0 

CCTGTGCAAT AAGGGGTGGA 3 00 

AAATAACCGT GGTCGAGATG 3 SO 

GGAGGGCCGA ATACGGCGGA 50 0 

GAAGAGTAAT TAAAT A.CGGC 550 

AJi-AAAAAATT TTTTTTCTAA 600 

616 
<210> 9

<211> 1562

<212> genomic DNA

<213> Arabidopsis thaliana

<400> 9 

ATGAA.AAAA.2u T?.TCTTCACA TTATTCGGTA GTCATAGCGA TACTCGTTGT 50
GGTGACGATG ACCTCGATGT GTCAAGCTGT GGGTAGCAAC GTGTACCCTT 100
TGATTCTGGT TCCAGGAAAC GGAGGTAACC AGCTAGAGGT ACGGCTGGAC 150
AGAGAATACA AGCCAAGTAG TGTCTGGTGT AGCAGCTGGT TATATCGGAT 200
TCATAAGAAG AGTGGTGGAT GGTTTAGGCT ATGGTTCGAT GCAGCAGTGT 250
TATTGTCTCC CTTCACCAGG TGCTTCAGCG ATCGAATGAT GTTGTACTAT 300
GACccrai^Ti: tggatgatta ccatvaatgct CCTGGTGTCC AAACCCGGGT 350
TCCTCATTTC GGTTCGACCA A\TCACTTCT ATACCTCGAC CCTCGTGTCC 400
GGTTAGTACT TTCCAAGATA ^TATCATTTTG GGACATTTGC ATAATGAACA 450
t^aatagacat aaaxttgggg gattattgtt ATATCAATAT CGATTTATAT 500
gctagtcggt aatgtgagtg ttatgttagt ATAGTTAATG TGAGTGTTAT 550
gtgattttcc a^tttta^u^tg a^,ggtaga?\a GTTGTCGTTT AATAATGTTG 600
CTATGTCATG AaiVATTATAA GGACACTATG TAAATGTAGC TTAATAATAA 650
GGTTXGATXT GCAGAGATGC CACATCTTAC AXGGAACATT TGGTGj:LMi.GC 700
TCTAGAGAAA AAATGCGGGT ATGTTAACGA CCAAACCAXC CTAQGAGCIC 750
CATATGATTT GAGGTACGGC CTGGCTGCTT CGGGGCACCG GTCCCGXGXA 800
GCCTCACAGT TCCTACAAGA CCTCAAACAA TTGGTGGAKP^ J^-AACXAGCAG 850
CGAGAACGAA GGAAAGCCAG TGATACTCCT CXCCCATAGC CXAGGAGGAC 900
TTTTCGTCCT CCATTTCCTC AACCGTACCA CCCCXXCATG GCGCCGCAAG 950
TACATCAAAC ACTTTGTTGC ACTCGCTGCG CCAXGGGGXG GGAGGAXCXC 1000
TCAGATGAAQ ACATTTGGTT CTGGCAACAC ACXCGGXGXC GCXXXAGXTA 1050
ACCCTTTGCT GGTCAGACGG CATCAGAGGA CGTCCGAGAG XAACCAATGG 1100
CTACTTCCAT GTACC-AAAGT GTTTCAGGAC AGAACXAAAC CGCXXGXCGX 1150
AJ^.CTCCCCZAG GTTAACTACA CAGCTTACGA GA,XGGAXCGG TTXTXXGCAG 1200
ACATTGGATT CTCACAAGGA GTTGTGCCTT ACAAGACAAG AGXGXXGCCX 1250
TTAACAGAGG AGCTGATGAC TCCGGGAGTG CCAGTCACTT GCATATATGG 1300

GAGAGGAGTT GATACACCGG AGGTTTTGAT GTATGGAAAA GGA.GGATTCG 13 50 

ATAAGCAACC AGAGATTAAG TATGGAGATG GAGATGGGAC GGTTAATTTG 1400 

GCGAGCTTAG CAGCTTTGAA AGTCGATAGC TTGAACACCG TAGAGATTGA 145 0 

TGGAGTTTCG CA.TA.CATCTA TA-CTTAAAGA CGAGATCGCA CTTAAAGAGA 15 0 0 

TTATGAAGCA GATTTCAATT ATTAATTATG AA.TTAGCCAA TGTTAATGCC 1550 

GTCAATGA-AT GA 
<210> 10

<213> Arabidopsis thaliana
<400> 10



ATGGGAGCGA 
TTXTTTCTTG 
TTCACGGCGA 
TCGACGCAGC 
GG/\CTTCA.\f 
ATCTTCATTT 
AATTCCAAGC 
TCGXTCATTA 
A/^AACAGCrO 
TAGGGCAATG 
CAAGTTTTTC 
CTTGTGCATT 
TGGTTTACAG 
TAGATCCTTA 
AGTGGTCT-TT 
TTTCGGATTT 
TGTGATATAA 
TTTCTACTGT 
GAAGCAA/kTG 

caa^ttgg;^-A 

TCAGGCTA-AT 
TCTGGTCGTC 
CCGTGGCGGC 

tcagatactt 
ttgaagtggc 
cctactatcc 

TGTTGTGACT 
GAGCTCCTCT 
GTA^i.CGTTTG 
TTTAAGTAHT 
TGAAAGTATT 
TAGATCTTGA 
GTCAGCGGCC 
TTCAGATTAT 
CTTTXTATTT 
TGACA.TGCGC 
GGTTGTTGTC 
AAGAATTGCA 
TGCAAAGAAJi. 
CAAAATATTC 
A.GTAGCGGTT 
CCAJ^.GAATGT 
TACAGAAACA 
GCCTTTTGTC 
GGAA.TAGAAG 
GAAGTA.CGTA 
TTGCTGGCTT 
XGTTCATATG 
CCTTATTATT 
GTTTACAGTT 
A'l-TCTCTAGT 
TCTTTGTGAA. 
TGA-TGACCCT 
A-2uAwi.TGrATT 
ATTCTCAATA. 



ATTCGAAATC 
ATTTGCGGTG 
CTACrCGAAG 
TACGAGCGTG 
CCGCTCGACC 
CCTTCGCTCC 
GrV^ATATAGC 
GTCAAGAGTG 
ACTCGGCGAG 
AATGTGTAAT 
AGAGTGCTGA 
GTGATTCTTT 
CTTCTTTCTG 
TAATCAAACA 
CAGCCATCAC 
TTCTTTCT'rr 
TATGGCTAAG 
CTGG.AAAGAG 
CAATTGTCGC 
GAGCGTGACC 
GTCTTTTA.TC 
TTCCTTTTTG 
CCTTCTATAG 
TCTGG.AATGG 
TTGATCAGCA 
TTAAGTTACC 
TACXGGA.TTG 
TCTTGGTTCT 
GCCTTCCTGT 
TGATATC^lAC 
ACTTXTGTTA 
AGXGCTAGXT 
TXAGCTAAXA 
TAXGGXA.GAC 
TAAXAGGCXA 
XTCXCAXGXT 
CAAXXCTXTX 
AGGGXGAXAA 
GAXAAGCGCG 
rGGCTGGCCG 
A.aACXCTGTA 
TCACXCTCAT 
GCXCXAGXCA 
TTTCACAGCC 
ACTATGACCC 
CCXXXCTTTG 
CTTGTACGXC 
CTTTGXCTTX 
GATXATCAGT 
AXGAMGCAA 
TTGTXTXGAC 
TTA-XAXATAJi. 
GXTXTXAATC 
TTGCAXATAT 
TCACAXXAXG 



AGXAACGGCT 
GCCGAACXGC 
CTATPaOGTA 
GTCGAXCCTT 
TCGXATGGCX 
XXAXTCTGTC 
AATGAAGCAX 
ACGCXTCTGA 
TGTTXCCCAT 
TAGXCTGCGC 
ATAGXAGTTA 
TGGTTGTXGC 
CXGXCAACXG 
GACCATCCCG 
AGAATTGGAT 
TGAGXTTXCT 
XXCAXTAATT 
XGGCXTA-^GT 
.XGXTCCAXAC 
TTXACTTXCA 
XXGTCTTTTX 
CAGGTXGACC 
TAXXXGCCCA 
CTGAGGCXAG 
XAXCCAXGCT 
AXTTXAX'TTX 
AGGXCGATAC 
GXXGAGGCA^. 
TTCTGAGGTG 
CAGGTCTTAT 
AXXGAACTGC 
AXCAAAGAAC 
CAACCAAACC 
TTXAAGTTGA 
XGAXTTGTTX 
TXTTGTTGGC 
GCGXCGXCAX 
CACATTCTGG 
XATACCACTG 
ACAAATATTA 
TATGCAACTG 
AXTXCGTTCG 
ACA:t:GACCAG 
CGXGAACTAG 
AGATAGCAAG 
TGATAAGAAA 
AAATXGTXTT 
CTTACXATAA 
TCXCTCCTTA 
^'J^GGGGGXAT 
TAATAGCGTC 
CATGCTAACT 
CTCTGACXCC 
GGXGGXCATC 
CGXTGACTTX 



XCGXXCACCG 
GGXGGAGGAT 
XAAXCATXCC 
GAGTGTCGAX 
AGACACCACX 
GGTCGAGXCA 

QTCTcarcrc 

ATGTGAGTXT 
CGCXTTTGGT 
TXXTTATXCA 
GAAAATGXTA 
XXACXGATCG 
CXGGTTTAAG 
AGTGTAAGTC 
CCAGGXTACA 
TCAATTTGAT 
TGGTCAAXXX 
GGXGTGXTGA 
CAXTGGAGAT 
CAAGCTCAAG 
AXGTAAGAXA 
TTXGAAACTG 
XXCAATGGGT 
AAAXXGCACC 
XATTXCQCTG 
TTCTCTA.ATT 
CXGATTTGXT 
TCAAAXCTAC 
ACCTCTGACX 
AACTCACTGG 
TGXAGGCGAT 
ATATXGTGGG 
ACAXGXACAG 
GAAGAAACTX 
AXTGAAATCA 
AAGGCXXCAG 
XGXGGCXXAX 
ACGCAXTTTT 
TGATGAAGAG 
XXAACATTCA 
TAACACTAAC 
XXXGATGXGX 
CATGGAAXGX 
CAGATGGGAC 
A.GGATGXTAC 
TATXGCTCAT 
GTTXAAAXCT 
GAAA,CAAGXA 
XATTATGGAA. 
TXXAGTXGAT 
AAXTTTGTXX 
AXACTTTTCA 
XXGGGAGAGA 
XA,AA.GACAGA 
GXXAXXATAT 



XGAXCGCCGX 
GAGACCGAGX 

ACACXCCGXX 
AAGGXCCGXG 
CXXGXXGAXG 
XCTXAXXGAX 
AGAGXGAXAX 
XGGCXAAAXG 
ACTAGAXCTG 
GGXCAXTXXA 
ACGTGAXGGA 

ACGGCCTGAC 
XAACAGGXAG 
AXGAXCTXGX 
XCA.GGXCCXC 
GXXXGGXAXA 
TGTCACCAAC 
XXAGXCCXTA 

agctaagagc 

CXXTAAAACX 
AATAA.XGXCT 
AA^-ACAXXAX 
TXGGTACCGG 
GGGGGAGXTA 
GTTGATXXAG 
TCXCXCXGGX 
XCTCXTTAGX 
AXXXTCCTXT 
AXGGT?.XCXG 
XAGXATACCX 
TGAXTXAGTX 
XGACTGAJ\AX 
XGXGACATAT 
GGAACTGGXC 
GCCATTXXCA 
CTGGGGGTQC 
GAATAXCAAT 
AAXXCCTTCC 
AAAAGTXXCA 
AXCCAXCAGT 
GGCCTTCCCA 
TCXXXXCAAA 
ACCAGTXAAA 
CGAXCAXCAC 
CXATA.TGAAX 
XAATCAGAAA, 
XGXCXTTXTC 
TGATTCXCXC 
XTCXAGCAAA 
GGXXGTAXCA 
CCACCXAXAA 
GGXAXGATGC 
XCCCCA.XTXG 



50 
100 
150 
200 
250 
300 
3Sfl 
400 
450 
500 
SSO 
€00 
€S0 
700 
7S0 
SCO 
SSO 
900 
250 
1000 
1050 
1100 
1150 
1200 
1250 
1300 
1350 
1400 
1450 
lb 00 
1550 
ISQO 
1650 
1700 
1750 
ISOO 
1350 
1900 
1950 
200 
2050 
2100 
2150 
2200 
2250 
23 00 
2350 
2400 
2450 
2500 
2550 
2500 
2550 
2700 
2750 
GTTTGCAATA TCTTTTTGr\i\
CCTATTAAOC GTXAAAGGTA 
GGTTA/CTACT TTGCCCCAAG 
CACGGAXATC ATTTATGAAA 
CCGCAATQGC ACAAGTAAAA 
GTGGCATG TT ATCTCAGTTQ 
AAGTACTTTT TTATCATTCC. 
AAGTGGGAAG AGGTGTTGCA 
AGCAAAACAA AACTAACCCA 
GTGCTTTTAA AAA-ATTTGTT 
GATTGTGCAA TATCTGCAGG 
CCTATAACTG GCGATGAGAC 
CTTCTTGCAA ACTACTGAAG 
CTTGCTATGT TCTCTAGTAC 
TGATTATGAA ATTGATCTCT 
GCAAGAATTG GCTCGGACCT 
CTGITTTTTA GTTCCTCACC 
CTGGTTATGT GTTGATTTAC 
TCTCTGTACT CCTCAAGAAC 
G.i^AAATAAAA CAACAGCCAG 
-TAAATGTTGA TCATGAGCAT 
GCACCAAGGG TTAAGTACAT 
GGGG?\AGAGA ACCGCAGTGT 



TTATGATTTA TCTTCTCCCT 
CTAAATGTAT GAAGCTGTCT 
TGGCAAACCT TATCCTGATA 
CTGAAGGTTC CCTCGTGTCA 
CAGGAAGGCA AAGTCTTCTG 
CATAAGCAAA TTATTAAACA 
TTTTGAGCTT AGTGGATGAT 
TGAAACATGA CACTTGTATC 
TTTCTGAATT TCATATTATT 
TTAAGAAACC GAAAAAGTAG 
TCTGGAACTG TGGTTGATGG 
GGTAAGCTGA GAAGTTGGTT 
ACTAAGATAA TACTTGCTTC 
ACTGCAATAT rOP^CTCTCCG 
TATAGGTACC CTATCATTCA 
AAAGTTAACA TAACAATGGC 
TTATATAGAT GAAACTTTAA 
CTCCAATTTG TTCTTTCTAA 
TTGTATTAAT CTAAACGAGA 
AACACGATGG AAGCGACGTA 
GGGTCAGACA TCATAGCTAA 
AACCTTTTAT GAAGACTCTG 
OGGAGCTTGA TAAAAGTGGG 



TG C/sT CTTAT 


2 5 0 0 


GTCATAGGTT 


2 S 5 Q 


ATTGGATCAT 




AGGTAATTTT 


2 S5 0 


TAT C ACj i w i A 


3 0 0 0 


ACTAAAATTT 


3 OS 0 


CAGTGGGTTA 


310 0 


AAA G AT AAGT 


3 150 


AGGAGTAGTC 


3 200 


XTC ATAT CTT 


3'25Q 


GAACGCTGGA 


3 3 0 0 


T TG AAATXAT 


3 3 50 




3 4 0 0 


ctacttttat 


3 4S0 


CTCTCTTGGT 


3500 


tccccaggta 


35S0 


GTGTACTTTT 


3SO0 


AAATCATATA 


3S5Q 


TTCTCATTGG 


3700 


CATGTGGAAC 


3750 


CATGACAAAA 


3S00 


AGAGCATTCC 


3S50 


TATTAA 


3895 
<210> 11
<211> 709
<212> cDNA
<213> tomato
<400> 11

CTGGGGCCAA AAGTGAACAT AAGAAGGACA CCACAGTCAG AGCATGATGT 50
TCAGATGTAC AAGTGCATCT AAATATAGAG CATCAACATG GTGAAGATAT 100
CATTCCCAAT ATGACAAAGT TACCTACAAT GAAGTACATA ACCTATTATG 150
ACGATTCTCA AAGTTTTCCA GGGACAAGAA CAGCAGTTTG GGAGCTTGAT 200
AAAGCAAATC ACAGGAACAT TGTCAGATCT CCACCTTTGA TGCGGGAGCT 250
GTGGCTTGAG ATGTGGCATG ATATTCATCC TGATAAAAAG TCCAAGTTTG 300
TTACAAAAGG TGGTGTCTGA TCCTCACTAT TTTCTTCTAT A^J^TGTTTGA 350
GT'rTGTATTG ACATTGTAAG TATTGCAACA AAAAGCAAAG CGTGGGCCTC 400
TGAGGGATGA GGACTGCTAT TGGGATTACG GGAAAGCTCG ATGTGCATGG 450
GCTGAACATT GTGAATACAG GTTAGAATAT TCA-^-AXTATA TTTTGCAAAA 500
TATTCTCTITT TTGTGTATTT AGGCCACCTT TCCCCGGTCA CAACGATGCA 550
GATATGTATT CGGGGATGTT CACCTGGGAC AGAGTTGCAG ATTGAAGAGT 600
TCTACATCTC ACATCCTGTC ACACTATGTG TGATATTTAA GAAACTTTGT 650
TTGGCGGAAC ^^-ACAAGTTTG CACAAACATT TGAAGAAGAA AGCGAAATGA 700
TTCAGAGAG 709